MATERIALS AND METHODS FOR
THE MODIFICATION OF PLANT LIGNIN CONTENT

Technical Field of the Invention

This invention relates to the field of modification of lignin content and 
composition in plants. More particularly, this invention relates to enzymes involved in 
the lignin biosynthetic pathway and nucleotide sequences encoding such enzymes. 

Background of the Invention

Lignin is an insoluble polymer which is primarily responsible for the rigidity of 
plant stems. Specifically, lignin serves as a matrix around the polysaccharide 
components of some plant cell walls. The higher the lignin content, the more rigid the 
plant. For example, tree species synthesize large quantities of lignin, with lignin
constituting between 20% and 30% of the dry weight of wood. In addition to providing
rigidity, lignin aids in water transport within plants by rendering cell walls hydrophobic 
and water impermeable. Lignin also plays a role in disease resistance of plants by 
impeding the penetration and propagation of pathogenic agents. 

The high concentration of lignin in trees presents a significant problem in the
paper industry wherein considerable resources must be employed to separate lignin
from the cellulose fiber needed for the production of paper. Methods typically 
employed for the removal of lignin are highly energy- and chemical-intensive, resulting 
in increased costs and increased levels of undesirable waste products. In the U.S. alone, 
about 20 million tons of lignin are removed from wood per year. 

Lignin is largely responsible for the digestibility, or lack thereof, of forage
crops, with small increases in plant lignin content resulting in relatively high decreases
in digestibility. For example, crops with reduced lignin content provide more efficient 
forage for cattle, with the yield of milk and meat being higher relative to the amount of 
forage crop consumed. During normal plant growth, the increase in dry matter content
is accompanied by a corresponding decrease in digestibility. When deciding on the
optimum time to harvest forage crops, farmers must therefore choose between a high
yield of less digestible material and a lower yield of more digestible material. 



WO 98/11205 



PCT/NZ97/00112 



For some applications, an increase in lignin content is desirable since increasing 
the lignin content of a plant would lead to increased mechanical strength of wood, 
changes in its color and increased resistance to rot. Mycorrhizal species composition 
and abundance may also be favorably manipulated by modifying lignin content and
structural composition.

As discussed in detail below, lignin is formed by polymerization of at least three 
different monolignols which are synthesized in a multistep pathway, each step in the 
pathway being catalyzed by a different enzyme. It has been shown that manipulation of
the number of copies of genes encoding certain enzymes, such as cinnamyl alcohol
dehydrogenase (CAD) and caffeic acid 3-O-methyltransferase (COMT), results in
modification of the amount of lignin produced; see, for example, U.S. Patent No.
5,451,514 and PCT publication no. WO 94/23044. Furthermore, it has been shown that
antisense expression of sequences encoding CAD in poplar leads to the production of
lignin having a modified composition (Grand, C. et al., Planta (Berl.) 163:232-237
(1985)).

While DNA sequences encoding some of the enzymes involved in the lignin 
biosynthetic pathway have been isolated for certain species of plants, genes encoding 
many of the enzymes in a wide range of plant species have not yet been identified. 
Thus there remains a need in the art for materials useful in the modification of lignin 
content and composition in plants and for methods for their use.

Summary of the Invention 

Briefly, the present invention provides isolated DNA sequences obtainable from 
eucalyptus and pine which encode enzymes involved in the lignin biosynthetic 
pathway, DNA constructs including such sequences, and methods for the use of such
constructs. Transgenic plants having altered lignin content and composition are also 
provided. 

In a first aspect, the present invention provides isolated DNA sequences coding 
for the following enzymes isolated from eucalyptus and pine: cinnamate 4-hydroxylase 
(C4H), coumarate 3-hydroxylase (C3H), phenolase (PNL), O-methyl transferase
(OMT), cinnamyl alcohol dehydrogenase (CAD), cinnamoyl-CoA reductase (CCR),
phenylalanine ammonia-lyase (PAL), 4-coumarate:CoA ligase (4CL), coniferol
glucosyl transferase (CGT), coniferin beta-glucosidase (CBG), laccase (LAC) and
peroxidase (POX), together with ferulate-5-hydroxylase (F5H) from eucalyptus. In one
embodiment, the isolated DNA sequences comprise a nucleotide sequence selected
from the group consisting of: (a) sequences recited in SEQ ID NO: 3, 13, 16-70, and
72-88; (b) complements of the sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88;
(c) reverse complements of the sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88;
(d) reverse sequences of the sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88; and
(e) sequences having at least about a 99% probability of being the same as a sequence
of (a) - (d) as measured by the computer algorithm FASTA.

In another aspect, the invention provides DNA constructs comprising a DNA
sequence of the present invention, either alone, in combination with one or more of the
inventive sequences or in combination with one or more known DNA sequences; 
together with transgenic cells comprising such constructs. 

In a related aspect, the present invention provides DNA constructs comprising,
in the 5'-3' direction, a gene promoter sequence; an open reading frame coding for at
least a functional portion of an enzyme encoded by the inventive DNA sequences or
variants thereof; and a gene termination sequence. The open reading frame may be
orientated in either a sense or antisense direction. DNA constructs comprising a
non-coding region of a gene coding for an enzyme encoded by the above DNA sequences or
a nucleotide sequence complementary to a non-coding region, together with a gene
promoter sequence and a gene termination sequence, are also provided. Preferably, the
gene promoter and termination sequences are functional in a host plant. Most
preferably, the gene promoter and termination sequences are those of the original
enzyme genes, but others generally used in the art, such as the Cauliflower Mosaic
Virus (CMV) promoter, with or without enhancers, such as the Kozak sequence or
Omega enhancer, and the Agrobacterium tumefaciens nopaline synthase terminator, may be
usefully employed in the present invention. Tissue-specific promoters may be
employed in order to target expression to one or more desired tissues. In a preferred
embodiment, the gene promoter sequence provides for transcription in xylem. The
DNA construct may further include a marker for the identification of transformed cells.

In a further aspect, transgenic plant cells comprising the DNA constructs of the
present invention are provided, together with plants comprising such transgenic cells, 
and fruits and seeds of such plants. 

In yet another aspect, methods for modulating the lignin content and
composition of a plant are provided, such methods including stably incorporating into
the genome of the plant a DNA construct of the present invention. In a preferred
embodiment, the target plant is a woody plant, preferably selected from the group
consisting of eucalyptus and pine species, most preferably from the group consisting of
Eucalyptus grandis and Pinus radiata. In a related aspect, a method for producing a
plant having altered lignin content is provided, the method comprising transforming a
plant cell with a DNA construct of the present invention to provide a transgenic cell, 
and cultivating the transgenic cell under conditions conducive to regeneration and 
mature plant growth. 

In yet a further aspect, the present invention provides methods for modifying the
activity of an enzyme in a plant, comprising stably incorporating into the genome of the
plant a DNA construct of the present invention. In a preferred embodiment, the target 
plant is a woody plant, preferably selected from the group consisting of eucalyptus and 
pine species, most preferably from the group consisting of Eucalyptus grandis and 
Pinus radiata. 

The above-mentioned and additional features of the present invention and the
manner of obtaining them will become apparent, and the invention will be best
understood by reference to the following more detailed description, read in conjunction 
with the accompanying drawing. 

Brief Description of the Figures

Fig. 1 is a schematic overview of the lignin biosynthetic pathway. 

Detailed Description 

Lignin is formed by polymerization of at least three different monolignols, 
primarily para-coumaryl alcohol, coniferyl alcohol and sinapyl alcohol. While these
three types of lignin subunits are well known, it is possible that slightly different
variants of these subunits may be involved in the lignin biosynthetic pathway in various
plants. The relative concentration of these residues in lignin varies between different
plant species and within species. In addition, the composition of lignin may also vary 
between different tissues within a specific plant. The three monolignols are derived 
from phenylalanine in a multistep process and are believed to be polymerized into 
lignin by a free radical mechanism. 

Fig. 1 shows the different steps in the biosynthetic pathway for coniferyl alcohol 
together with the enzymes responsible for catalyzing each step. para-Coumaryl alcohol
and sinapyl alcohol are synthesized by similar pathways. Phenylalanine is first
deaminated by phenylalanine ammonia-lyase (PAL) to give cinnamate, which is then
hydroxylated by cinnamate 4-hydroxylase (C4H) to form p-coumarate. p-Coumarate is
hydroxylated by coumarate 3-hydroxylase (C3H) to give caffeate. The newly added hydroxyl
group is then methylated by O-methyl transferase (OMT) to give ferulate, which is
conjugated to coenzyme A by 4-coumarate:CoA ligase (4CL) to form feruloyl-CoA.
Reduction of feruloyl-CoA to coniferaldehyde is catalyzed by cinnamoyl-CoA
reductase (CCR). Coniferaldehyde is further reduced by the action of cinnamyl alcohol
dehydrogenase (CAD) to give coniferyl alcohol, which is then converted into its
glucosylated form for export from the cytoplasm to the cell wall by coniferol glucosyl
transferase (CGT). Following export, the de-glucosylated form of coniferyl alcohol is
obtained by the action of coniferin beta-glucosidase (CBG). Finally, polymerization of
the three monolignols to provide lignin is catalyzed by phenolase (PNL), laccase (LAC) 
and peroxidase (POX). 
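
The steps described above for coniferyl alcohol can be restated as an ordered table of substrate, enzyme and product, which makes it straightforward to see which intermediates lie downstream of a given enzyme. The following Python sketch is illustrative only: the names CONIFERYL_PATHWAY, is_connected and products_from are hypothetical, and the label "coniferin" for the glucosylated form of coniferyl alcohol is an assumption inferred from the enzyme name coniferin beta-glucosidase.

```python
# Ordered steps of the coniferyl alcohol branch as described in the text:
# (substrate, enzyme abbreviation, product). Illustrative data only.
CONIFERYL_PATHWAY = [
    ("phenylalanine", "PAL", "cinnamate"),
    ("cinnamate", "C4H", "p-coumarate"),
    ("p-coumarate", "C3H", "caffeate"),
    ("caffeate", "OMT", "ferulate"),
    ("ferulate", "4CL", "feruloyl-CoA"),
    ("feruloyl-CoA", "CCR", "coniferaldehyde"),
    ("coniferaldehyde", "CAD", "coniferyl alcohol"),
    # "coniferin" is an assumed label for the glucosylated export form.
    ("coniferyl alcohol", "CGT", "coniferin"),
    ("coniferin", "CBG", "coniferyl alcohol"),
]


def is_connected(pathway):
    """Check that each step's product is the next step's substrate."""
    return all(
        pathway[i][2] == pathway[i + 1][0] for i in range(len(pathway) - 1)
    )


def products_from(enzyme):
    """Return the products formed at and after the step catalyzed by `enzyme`."""
    enzymes = [step[1] for step in CONIFERYL_PATHWAY]
    start = enzymes.index(enzyme)  # raises ValueError for an unknown enzyme
    return [step[2] for step in CONIFERYL_PATHWAY[start:]]


print(is_connected(CONIFERYL_PATHWAY))
print(products_from("CCR"))
```

Under these assumptions, altering an early enzyme such as PAL touches every later intermediate, whereas altering CAD affects only the final alcohol and its export form.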

The formation of sinapyl alcohol involves an additional enzyme, ferulate-5-hydroxylase
(F5H). For a more detailed review of the lignin biosynthetic pathway, see:
Whetten, R. and Sederoff, R., The Plant Cell 7:1001-1013 (1995).

Quantitative and qualitative modifications in plant lignin content are known to 
be induced by external factors such as light stimulation, low calcium levels and 
mechanical stress. Synthesis of new types of lignins, sometimes in tissues not normally 
lignified, can also be induced by infection with pathogens. In addition to lignin, several
other classes of plant products are derived from phenylalanine, including flavonoids, 
coumarins, stilbenes and benzoic acid derivatives, with the initial steps in the synthesis 
of all these compounds being the same. Thus modification of the action of PAL, C4H 
and 4CL may affect the synthesis of other plant products in addition to lignin.

Using the methods and materials of the present invention, the lignin content of a
plant can be increased by incorporating additional copies of genes encoding enzymes
involved in the lignin biosynthetic pathway into the genome of the target plant.
Similarly, a decrease in lignin content can be obtained by transforming the target plant
with antisense copies of such genes. In addition, the number of copies of genes
encoding different enzymes in the lignin biosynthetic pathway can be manipulated
to modify the relative amount of each monolignol synthesized, thereby leading to the
formation of lignin having altered composition. The alteration of lignin composition
would be advantageous, for example, in tree processing for paper, and may also be
effective in altering the palatability of wood materials to rotting fungi.

In one embodiment, the present invention provides isolated complete or partial 
DNA sequences encoding, or partially encoding, enzymes involved in the lignin 
biosynthetic pathway, the DNA sequences being obtainable from eucalyptus and pine.
Specifically, the present invention provides isolated DNA sequences encoding the
enzymes CAD (SEQ ID NO: 1, 30), PAL (SEQ ID NO: 16), C4H (SEQ ID NO: 17),
C3H (SEQ ID NO: 18), F5H (SEQ ID NO: 19-21), OMT (SEQ ID NO: 22-25), CCR
(SEQ ID NO: 26-29), CGT (SEQ ID NO: 31-33), CBG (SEQ ID NO: 34), PNL (SEQ
ID NO: 35, 36), LAC (SEQ ID NO: 37-41) and POX (SEQ ID NO: 42-44) from
Eucalyptus grandis; and the enzymes C4H (SEQ ID NO: 2, 3, 48, 49), C3H (SEQ ID
NO: 4, 50-52), PNL (SEQ ID NO: 5, 81), OMT (SEQ ID NO: 6, 53-55), CAD (SEQ ID
NO: 7, 71), CCR (SEQ ID NO: 8, 58-70), PAL (SEQ ID NO: 9-11, 45-47), 4CL (SEQ
ID NO: 12, 56, 57), CGT (SEQ ID NO: 72), CBG (SEQ ID NO: 73-80), LAC (SEQ ID
NO: 82-84) and POX (SEQ ID NO: 13, 85-88) from Pinus radiata. Complements of
such isolated DNA sequences, reverse complements of such isolated DNA sequences
and reverse sequences of such isolated DNA sequences, together with variants of such
sequences, are also provided. DNA sequences encompassed by the present invention 
include cDNA, genomic DNA, recombinant DNA and wholly or partially chemically 
synthesized DNA molecules. 

The definition of the terms "complement", "reverse complement" and "reverse
sequence", as used herein, is best illustrated by the following example. For the
sequence 5' AGGACC 3', the complement, reverse complement and reverse sequence
are as follows:

complement            3' TCCTGG 5'
reverse complement    3' GGTCCT 5'
reverse sequence      5' CCAGGA 3'.
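
The three definitions above can be reproduced with a few lines of Python. This is a minimal sketch rather than part of the disclosed methods; the function names are hypothetical, and the functions return only the letter strings, so the 5' and 3' end labels remain those given in the example above.

```python
# Watson-Crick pairing for DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Replace each base by its pairing partner, keeping the written order."""
    return "".join(PAIR[base] for base in seq)


def reverse_sequence(seq):
    """Write the same bases in the opposite order."""
    return seq[::-1]


def reverse_complement(seq):
    """Complement each base and write the result in the opposite order."""
    return complement(seq)[::-1]


example = "AGGACC"
print(complement(example))          # TCCTGG
print(reverse_complement(example))  # GGTCCT
print(reverse_sequence(example))    # CCAGGA
```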

As used herein, the term "variant" covers any sequence which exhibits at least 
about 50%, more preferably at least about 70% and, more preferably yet, at least
about 90% identity to a sequence of the present invention. Most preferably, a
"variant" is any sequence which has at least about a 99% probability of being the
same as the inventive sequence. The probability for DNA sequences is measured by
the computer algorithm FASTA (version 2.0u4, February 1996; Pearson, W. R. et al.,
Proc. Natl. Acad. Sci. 85:2444-2448, 1988), the probability for translated DNA
sequences is measured by the computer algorithm TBLASTX and that for protein
sequences is measured by the computer algorithm BLASTP (Altschul, S. F. et al., J.
Mol. Biol. 215:403-410, 1990). The term "variants" thus encompasses sequences
wherein the probability of finding a match by chance (smallest sum probability) in a
database is less than about 1% as measured by any of the above tests.
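
The identity thresholds in the preceding definition can be illustrated with a simple calculation. The Python sketch below is not FASTA, TBLASTX or BLASTP and does not perform an alignment; it assumes two sequences that have already been aligned to equal length, and the function names and tier labels are hypothetical. Because the text says "at least about", the cut-offs are treated here as nominal values.

```python
def percent_identity(a, b):
    """Percentage of matching positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def variant_tier(identity):
    """Map a percent identity onto the preference tiers named in the text."""
    if identity >= 90:
        return "at least about 90% (more preferably yet)"
    if identity >= 70:
        return "at least about 70% (more preferably)"
    if identity >= 50:
        return "at least about 50% (variant)"
    return "below about 50% (not a variant by identity)"


def passes_chance_threshold(p_chance):
    """True if the probability of a chance match is below about 1%."""
    return p_chance < 0.01


identity = percent_identity("AGGACC", "AGGTCC")  # 5 of 6 positions match
print(round(identity, 1), variant_tier(identity))
```

In practice the probability value would come from the output of the search program itself; the last function only shows how the "less than about 1%" criterion would be applied to such a value.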

Variants of the isolated sequences from other eucalyptus and pine species, as 
well as from other commercially important species utilized by the lumber industry, 
are contemplated. These include the following gymnosperms, by way of example: 
loblolly pine Pinus taeda, slash pine Pinus elliottii, sand pine Pinus clausa, longleaf pine
Pinus palustris, shortleaf pine Pinus echinata, ponderosa pine Pinus ponderosa, Jeffrey
pine Pinus jeffreyi, red pine Pinus resinosa, pitch pine Pinus rigida, jack pine Pinus
banksiana, pond pine Pinus serotina, Eastern white pine Pinus strobus, Western white
pine Pinus monticola, sugar pine Pinus lambertiana, Virginia pine Pinus virginiana,
lodgepole pine Pinus contorta, Caribbean pine Pinus caribaea, P. pinaster, Calabrian
pine P. brutia, Afghan pine P. eldarica, Coulter pine P. coulteri, European pine P.
nigra and P. sylvestris; Douglas-fir Pseudotsuga menziesii; the hemlocks which include
Western hemlock Tsuga heterophylla, Eastern hemlock Tsuga canadensis, Mountain
hemlock Tsuga mertensiana; the spruces which include the Norway spruce Picea abies,
red spruce Picea rubens, white spruce Picea glauca, black spruce Picea mariana, Sitka
spruce Picea sitchensis, Engelmann spruce Picea engelmannii, and blue spruce Picea
pungens; redwood Sequoia sempervirens; the true firs which include the Alpine fir Abies
lasiocarpa, silver fir Abies amabilis, grand fir Abies grandis, noble fir Abies procera,
white fir Abies concolor, California red fir Abies magnifica, and balsam fir Abies
balsamea; the cedars which include the Western red cedar Thuja plicata, incense
cedar Libocedrus decurrens, Northern white cedar Thuja occidentalis, Port Orford cedar
Chamaecyparis lawsoniana, Atlantic white cedar Chamaecyparis thyoides, Alaska
yellow-cedar Chamaecyparis nootkatensis, and Eastern red cedar Juniperus virginiana;
the larches which include Eastern larch Larix laricina, Western larch Larix
occidentalis, European larch Larix decidua, Japanese larch Larix leptolepis, and
Siberian larch Larix sibirica; bald cypress Taxodium distichum and Giant sequoia
Sequoia gigantea;

and the following angiosperms, by way of example: 

Eucalyptus alba, E. bancroftii, E. botryoides, E. bridgesiana, E. calophylla, E.
camaldulensis, E. citriodora, E. cladocalyx, E. coccifera, E. curtisii, E. dalrympleana, E.
deglupta, E. delegatensis, E. diversicolor, E. dunnii, E. ficifolia, E. globulus, E.
gomphocephala, E. gunnii, E. henryi, E. laevopinea, E. macarthurii, E. macrorhyncha,
E. maculata, E. marginata, E. megacarpa, E. melliodora, E. nicholii, E. nitens, E.
nova-anglica, E. obliqua, E. obtusiflora, E. oreades, E. pauciflora, E. polybractea, E. regnans,
E. resinifera, E. robusta, E. rudis, E. saligna, E. sideroxylon, E. stuartiana, E. tereticornis,
E. torelliana, E. urnigera, E. urophylla, E. viminalis, E. viridis, E. wandoo and E.
youmanni.

The inventive DNA sequences may be isolated by high throughput sequencing 
of cDNA libraries such as those prepared from Eucalyptus grandis and Pinus radiata
as described below in Examples 1 and 2. Alternatively, oligonucleotide probes based
on the sequences provided in SEQ ID NO: 1-13 and 16-88 can be synthesized and
used to identify positive clones in either cDNA or genomic DNA libraries from
Eucalyptus grandis and Pinus radiata, or from other gymnosperms and angiosperms
including those identified above, by means of hybridization or PCR techniques.
Probes can be shorter than the sequences provided herein but should be at least
about 10, preferably at least about 15 and most preferably at least about 20
nucleotides in length. Hybridization and PCR techniques suitable for use with such
oligonucleotide probes are well known in the art. Positive clones may be analyzed
by restriction enzyme digestion, DNA sequencing or the like.
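
As an illustration of the probe lengths mentioned above, the following Python sketch enumerates every contiguous stretch of a chosen length within a sequence and refuses lengths below the stated minimum of about 10 nucleotides. It is a sketch only: the function name, the default length of 20 and the example sequence are hypothetical, and no check of melting temperature, specificity or secondary structure is attempted.

```python
MIN_PROBE_LENGTH = 10  # "at least about 10" nucleotides, per the text


def candidate_probes(sequence, length=20):
    """Return every contiguous subsequence of `length` nucleotides."""
    if length < MIN_PROBE_LENGTH:
        raise ValueError("probe length is below the minimum stated in the text")
    return [
        sequence[i:i + length] for i in range(len(sequence) - length + 1)
    ]


# Hypothetical 25-nucleotide sequence used purely for demonstration.
demo = "AGGACCTTGACCATGGCAAGTCCGA"
probes = candidate_probes(demo, length=20)
print(len(probes), probes[0])
```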

In addition, the DNA sequences of the present invention may be generated by 
synthetic means using techniques well known in the art. Equipment for automated 
synthesis of oligonucleotides is commercially available from suppliers such as Perkin 
Elmer/Applied Biosystems Division (Foster City, CA) and may be operated according 
to the manufacturer's instructions.

In one embodiment, the DNA constructs of the present invention include an
open reading frame coding for at least a functional portion of an enzyme encoded by a
nucleotide sequence of the present invention or a variant thereof. As used herein, the
"functional portion" of an enzyme is that portion which contains the active site
essential for effecting the metabolic step, i.e. the portion of the molecule that is capable
of binding one or more reactants or is capable of improving or regulating the rate of
reaction. The active site may be made up of separate portions present on one or more
polypeptide chains and will generally exhibit high substrate specificity. The term
"enzyme encoded by a nucleotide sequence" as used herein, includes enzymes encoded
by a nucleotide sequence which includes the partial isolated DNA sequences of the
present invention.

For applications where amplification of lignin synthesis is desired, the open
reading frame is inserted in the DNA construct in a sense orientation, such that
transformation of a target plant with the DNA construct will lead to an increase in the
number of copies of the gene and therefore an increase in the amount of enzyme. When
down-regulation of lignin synthesis is desired, the open reading frame is inserted in the
DNA construct in an antisense orientation, such that the RNA produced by transcription
of the DNA sequence is complementary to the endogenous mRNA sequence. This, in
turn, will result in a decrease in the number of copies of the gene and therefore a
decrease in the amount of enzyme. Alternatively, regulation can be achieved by
inserting appropriate sequences or subsequences (e.g. DNA or RNA) in ribozyme 
constructs. 
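
The relationship between insert orientation and the resulting transcript can be shown with a short Python sketch. It models the open reading frame as the coding-strand DNA written 5' to 3', treats an antisense insert as the reverse complement of that strand placed behind the promoter, and checks that the antisense transcript pairs base for base with the endogenous mRNA. The function names and the six-base example are hypothetical and chosen only for illustration.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def transcribe(coding_strand):
    """The transcript has the coding strand's sequence with U in place of T."""
    return coding_strand.replace("T", "U")


def reverse_complement_dna(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(DNA_PAIR[base] for base in reversed(seq))


def transcript_of_insert(orf, orientation):
    """Transcript produced when `orf` is inserted in the given orientation."""
    if orientation == "sense":
        return transcribe(orf)
    if orientation == "antisense":
        return transcribe(reverse_complement_dna(orf))
    raise ValueError("orientation must be 'sense' or 'antisense'")


def is_complementary(rna_a, rna_b):
    """True if rna_a (5' to 3') pairs with rna_b read in the opposite direction."""
    return len(rna_a) == len(rna_b) and all(
        RNA_PAIR[x] == y for x, y in zip(rna_a, reversed(rna_b))
    )


orf = "ATGGCC"  # hypothetical fragment of an open reading frame
mrna = transcript_of_insert(orf, "sense")
antisense_rna = transcript_of_insert(orf, "antisense")
print(mrna, antisense_rna, is_complementary(mrna, antisense_rna))
```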

In a second embodiment, the inventive DNA constructs comprise a nucleotide
sequence including a non-coding region of a gene coding for an enzyme encoded by a
DNA sequence of the present invention, or a nucleotide sequence complementary to
such a non-coding region. As used herein the term "non-coding region" includes both
transcribed sequences which are not translated, and non-transcribed sequences within
about 2000 base pairs 5' or 3' of the translated sequences or open reading frames.
Examples of non-coding regions which may be usefully employed in the inventive
constructs include introns and 5'-non-coding leader sequences. Transformation of a
target plant with such a DNA construct may lead to a reduction in the amount of lignin
synthesized by the plant by the process of cosuppression, in a manner similar to that
discussed, for example, by Napoli et al. (Plant Cell 2:279-290, 1990) and de Carvalho
Niebel et al. (Plant Cell 7:347-358, 1995).

The DNA constructs of the present invention further comprise a gene promoter 
sequence and a gene termination sequence, operably linked to the DNA sequence to be 
transcribed, which control expression of the gene. The gene promoter sequence is
generally positioned at the 5' end of the DNA sequence to be transcribed, and is
employed to initiate transcription of the DNA sequence. Gene promoter sequences are
generally found in the 5' non-coding region of a gene but they may exist in introns
(Luehrsen, K. R., Mol. Gen. Genet. 225:81-93, 1991) or in the coding region, as for
example in PAL of tomato (Bloksberg, 1991, Studies on the Biology of Phenylalanine
Ammonia Lyase and Plant Pathogen Interaction, Ph.D. Thesis, Univ. of California,
Davis, University Microfilms International order number 9217564). When the
construct includes an open reading frame in a sense orientation, the gene promoter
sequence also initiates translation of the open reading frame. For DNA constructs
comprising either an open reading frame in an antisense orientation or a non-coding
region, the gene promoter sequence consists only of a transcription initiation site having
an RNA polymerase binding site.
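
The 5' to 3' arrangement of promoter, transcribed sequence and terminator described above can be captured in a small record type. The Python sketch below is an assumption-laden illustration: the class name DnaConstruct, its fields and the particular combination of elements in the example are hypothetical placeholders and are not taken from the examples of this specification.

```python
from dataclasses import dataclass

VALID_ORIENTATIONS = ("sense", "antisense")


@dataclass(frozen=True)
class DnaConstruct:
    """Elements of a construct, named in 5' to 3' order."""
    promoter: str
    insert: str
    terminator: str
    orientation: str = "sense"

    def __post_init__(self):
        # Reject orientations other than the two discussed in the text.
        if self.orientation not in VALID_ORIENTATIONS:
            raise ValueError(f"orientation must be one of {VALID_ORIENTATIONS}")

    def elements_5_to_3(self):
        """List the elements in the order they appear on the construct."""
        return [
            ("promoter", self.promoter),
            (f"insert ({self.orientation})", self.insert),
            ("terminator", self.terminator),
        ]


# Placeholder element names, chosen only to demonstrate the ordering.
example = DnaConstruct(
    promoter="CaMV 35S promoter",
    insert="CAD open reading frame",
    terminator="nopaline synthase 3' end",
    orientation="antisense",
)
for role, name in example.elements_5_to_3():
    print(f"{role}: {name}")
```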

A variety of gene promoter sequences which may be usefully employed in the 
DNA constructs of the present invention are well known in the art. The gene promoter
sequence, and also the gene termination sequence, may be endogenous to the target
plant host or may be exogenous, provided the promoter is functional in the target host.
For example, the promoter and termination sequences may be from other plant species,
plant viruses, bacterial plasmids and the like. Preferably, gene promoter and
termination sequences are from the inventive sequences themselves.

Factors influencing the choice of promoter include the desired tissue specificity
of the construct, and the timing of transcription and translation. For example,
constitutive promoters, such as the 35S Cauliflower Mosaic Virus (CaMV 35S)
promoter, will affect the activity of the enzyme in all parts of the plant. Use of a
tissue-specific promoter will result in production of the desired sense or antisense RNA only
in the tissue of interest. With DNA constructs employing inducible gene promoter
sequences, the rate of RNA polymerase binding and initiation can be modulated by 
external stimuli, such as light, heat, anaerobic stress, alteration in nutrient conditions
and the like. Temporally regulated promoters can be employed to effect modulation of
the rate of RNA polymerase binding and initiation at a specific time during
development of a transformed cell. Preferably, the original promoters from the enzyme
gene in question, or promoters from a specific tissue-targeted gene in the organism to
be transformed, such as eucalyptus or pine, are used. Other examples of gene promoters
which may be usefully employed in the present invention include mannopine synthase
(mas), octopine synthase (ocs) and those reviewed by Chua et al. (Science 244:174-181,
1989).

The gene termination sequence, which is located 3' to the DNA sequence to be
transcribed, may come from the same gene as the gene promoter sequence or may be
from a different gene. Many gene termination sequences known in the art may be
usefully employed in the present invention, such as the 3' end of the Agrobacterium 
tumefaciens nopaline synthase gene. However, preferred gene terminator sequences are 
those from the original enzyme gene or from the target species to be transformed. 

The DNA constructs of the present invention may also contain a selection
marker that is effective in plant cells, to allow for the detection of transformed cells
containing the inventive construct. Such markers, which are well known in the art,
typically confer resistance to one or more toxins. One example of such a marker is the
NPTII gene whose expression results in resistance to kanamycin or hygromycin,
antibiotics which are usually toxic to plant cells at a moderate concentration (Rogers et
al. in Methods for Plant Molecular Biology, A. Weissbach and H. Weissbach, eds.,
Academic Press Inc., San Diego, CA (1988)). Alternatively, the presence of the desired
construct in transformed cells can be determined by means of other techniques well 
known in the art, such as Southern and Western blots. 

Techniques for operatively linking the components of the inventive DNA
constructs are well known in the art and include the use of synthetic linkers containing
one or more restriction endonuclease sites as described, for example, by Maniatis et al.
(Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold
Spring Harbor, NY, 1989). The DNA construct of the present invention may be linked
to a vector having at least one replication system, for example, E. coli, whereby after
each manipulation, the resulting construct can be cloned and sequenced and the 
correctness of the manipulation determined.

The DNA constructs of the present invention may be used to transform a variety
of plants, including monocotyledonous plants (e.g. grasses, corn, grains, oat, wheat and
barley), dicotyledonous plants (e.g. Arabidopsis, tobacco, legumes, alfalfa, oaks,
eucalyptus, maple), and gymnosperms (e.g. Scots pine (Aronen, Finnish Forest Res.
Papers, vol. 595, 1996), white spruce (Ellis et al., Biotechnology 11:94-92, 1993), larch
(Huang et al., In Vitro Cell 27:201-207, 1991)). In a preferred embodiment, the inventive DNA
constructs are employed to transform woody plants, herein defined as a tree or shrub
whose stem lives for a number of years and increases in diameter each year by the
addition of woody tissue. Preferably the target plant is selected from the group
consisting of eucalyptus and pine species, most preferably from the group consisting of
Eucalyptus grandis and Pinus radiata. As discussed above, transformation of a plant
with a DNA construct including an open reading frame coding for an enzyme encoded
by an inventive DNA sequence wherein the open reading frame is orientated in a sense
direction will lead to an increase in lignin content of the plant or, in some cases, to a
decrease by cosuppression. Transformation of a plant with a DNA construct
comprising an open reading frame in an antisense orientation or a non-coding 
(untranslated) region of a gene will lead to a decrease in the lignin content of the 
transformed plant. 

Techniques for stably incorporating DNA constructs into the genome of target 
plants are well known in the art and include Agrobacterium tumefaciens mediated
introduction, electroporation, protoplast fusion, injection into reproductive organs,
injection into immature embryos, high velocity projectile introduction and the like. The
choice of technique will depend upon the target plant to be transformed. For example,
dicotyledonous plants and certain monocots and gymnosperms may be transformed by
Agrobacterium Ti plasmid technology, as described, for example, by Bevan (Nucl. Acids
Res. 12:8711-8721, 1984). Targets for the introduction of the DNA constructs of the
present invention include tissues, such as leaf tissue, disseminated cells, protoplasts,
seeds, embryos, meristematic regions, cotyledons, hypocotyls, and the like. One
preferred method for transforming eucalyptus and pine is a biolistic method using
pollen (see, for example, Aronen 1996, Finnish Forest Res. Papers vol. 595, 53pp) or
easily regenerable embryonic tissues. Other transformation techniques which may be
usefully employed in the inventive methods include those taught by Ellis et al. (Plant
Cell Reports 8:16-20, 1989), Wilson et al. (Plant Cell Reports 7:704-707, 1989) and
Tautorus et al. (Theor. Appl. Genet. 78:531-536, 1989).

Once the cells are transformed, cells having the inventive DNA construct 
incorporated in their genome may be selected by means of a marker, such as the
kanamycin resistance marker discussed above. Transgenic cells may then be cultured
in an appropriate medium to regenerate whole plants, using techniques well known in
the art. In the case of protoplasts, the cell wall is allowed to reform under appropriate
osmotic conditions. In the case of seeds or embryos, an appropriate germination or
callus initiation medium is employed. For explants, an appropriate regeneration
medium is used. Regeneration of plants is well established for many species. For a
review of regeneration of forest trees see Dunstan et al., Somatic embryogenesis in
woody plants. In: Thorpe, T.A. ed., 1995: In vitro embryogenesis of plants. Vol. 20 in
Current Plant Science and Biotechnology in Agriculture, Chapter 12, pp. 471-540.
Specific protocols for the regeneration of spruce are discussed by Roberts et al.
(Somatic Embryogenesis of Spruce. In: Synseed. Applications of synthetic seed to crop
improvement. Redenbaugh, K., ed. CRC Press, Chapter 23, pp. 427-449, 1993). The 
resulting transformed plants may be reproduced sexually or asexually, using methods 
well known in the art, to give successive generations of transgenic plants. 

As discussed above, the production of RNA in target plant cells can be
controlled by choice of the promoter sequence, or by selecting the number of functional
copies or the site of integration of the DNA sequences incorporated into the genome of
the target plant host. A target plant may be transformed with more than one DNA
construct of the present invention, thereby modulating the lignin biosynthetic pathway
for the activity of more than one enzyme, affecting enzyme activity in more than one
tissue or affecting enzyme activity at more than one expression time. Similarly, a DNA
construct may be assembled containing more than one open reading frame coding for
an enzyme encoded by a DNA sequence of the present invention or more than one
non-coding region of a gene coding for such an enzyme. The DNA sequences of the
present invention may also be employed in combination with other known sequences
encoding enzymes involved in the lignin biosynthetic pathway. In this manner, it may be
possible to add a lignin biosynthetic pathway to a non-woody plant to produce a new
woody plant.

The isolated DNA sequences of the present invention may also be employed as
probes to isolate DNA sequences encoding enzymes involved in the lignin synthetic
pathway from other plant species, using techniques well known to those of skill in the
art.

The following examples are offered by way of illustration and not by way of
limitation.

Example 1

Isolation and Characterization of cDNA Clones from Eucalyptus grandis

Two Eucalyptus grandis cDNA expression libraries (one from a mixture of
various tissues from a single tree and one from leaves of a single tree) were constructed
and screened as follows.

mRNA was extracted from the plant tissue using the protocol of Chang et al.
(Plant Molecular Biology Reporter 11:113-116 (1993)) with minor modifications.
Specifically, samples were dissolved in CPC-RNAXB (100 mM Tris-Cl, pH 8.0; 25
mM EDTA; 2.0 M NaCl; 2% CTAB; 2% PVP and 0.05% Spermidine HCl) and
extracted with chloroform:isoamyl alcohol, 24:1. mRNA was precipitated with ethanol
and the total RNA preparation was purified using a Poly(A) Quik mRNA Isolation Kit
(Stratagene, La Jolla, CA). A cDNA expression library was constructed from the
purified mRNA by reverse transcriptase synthesis followed by insertion of the resulting
cDNA clones in Lambda ZAP using a ZAP Express cDNA Synthesis Kit (Stratagene),
according to the manufacturer's protocol. The resulting cDNAs were packaged using a
Gigapack II Packaging Extract (Stratagene) employing 1 µl of sample DNA from the 5
µl ligation mix. Mass excision of the library was done using XL1-Blue MRF' cells and
XLOLR cells (Stratagene) with ExAssist helper phage (Stratagene). The excised
phagemids were diluted with NZY broth (Gibco BRL, Gaithersburg, MD) and plated
out onto LB-kanamycin agar plates containing X-gal and isopropylthio-beta-galactoside
(IPTG).

Of the colonies plated and picked for DNA miniprep, 99% contained an insert
suitable for sequencing. Positive colonies were cultured in NZY broth with kanamycin
and cDNA was purified by means of alkaline lysis and polyethylene glycol (PEG)
precipitation. A 1% agarose gel was used to screen sequencing templates for
chromosomal contamination. Dye primer sequences were prepared using a Turbo
Catalyst 800 machine (Perkin Elmer/Applied Biosystems, Foster City, CA) according
to the manufacturer's protocol.

DNA sequence for positive clones was obtained using an Applied Biosystems
Prism 377 sequencer. cDNA clones were sequenced first from the 5' end and, in
some cases, also from the 3' end. For some clones, internal sequence was obtained
using subcloned fragments. Subcloning was performed using standard procedures of
restriction mapping and subcloning to pBluescript II SK+ vector.

The determined cDNA sequence was compared to known sequences in the
EMBL database (release 46, March 1996) using the FASTA algorithm of February
1996 (version 2.0u4) (available on the Internet at the ftp site
ftp://ftp.virginia.edu/pub/fasta/). Multiple alignments of redundant sequences were
used to build up reliable consensus sequences. Based on similarity to known sequences
from other plant species, the isolated DNA sequence (SEQ ID NO: 1) was identified as
encoding a CAD enzyme.
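
The consensus-building step mentioned above can be illustrated with a minimal sketch. It assumes the redundant reads have already been aligned to a common length, with '-' marking gaps, and it uses a simple per-column majority vote. The function name and the toy reads are hypothetical, and this is not presented as the procedure actually used with the FASTA alignments.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Return a per-column majority-vote consensus of equal-length aligned reads.

    Gap characters ('-') and 'N' are ignored when voting. A column with no
    informative base, or with a tie for first place, is reported as 'N'.
    """
    if not aligned_reads:
        raise ValueError("need at least one read")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_reads):
        votes = Counter(base for base in column if base not in "-N")
        ranked = votes.most_common(2)
        tied = len(ranked) == 2 and ranked[0][1] == ranked[1][1]
        consensus.append("N" if not ranked or tied else ranked[0][0])
    return "".join(consensus)


# Hypothetical redundant reads covering the same 12-base window.
reads = [
    "CTTCGCGCTACC",
    "CTTCGCGNTACC",
    "CTTCGAGCTAC-",
]
print(majority_consensus(reads))
```

Columns where the reads disagree without a clear majority are left as 'N', which mirrors the ambiguity codes that appear in the sequence listing.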

In further studies, using the procedure described above, cDNA sequences
encoding the following Eucalyptus grandis enzymes were isolated: PAL (SEQ ID NO:
16); C4H (SEQ ID NO: 17); C3H (SEQ ID NO: 18); F5H (SEQ ID NO: 19-21); OMT
(SEQ ID NO: 22-25); CCR (SEQ ID NO: 26-29); CAD (SEQ ID NO: 30); CGT (SEQ
ID NO: 31-33); CBG (SEQ ID NO: 34); PNL (SEQ ID NO: 35, 36); LAC (SEQ ID
NO: 37-41); and POX (SEQ ID NO: 42-44).

Example 2

Isolation and Characterization of cDNA Clones from Pinus radiata

a) Isolation of cDNA clones by high throughput screening

A Pinus radiata cDNA expression library was constructed from xylem and
screened as described above in Example 1. DNA sequence for positive clones was
obtained using forward and reverse primers on an Applied Biosystems Prism 377
sequencer and the determined sequences were compared to known sequences in the
database as described above.

Based on similarity to known sequences from other plant species, the isolated
DNA sequences were identified as encoding the enzymes C4H (SEQ ID NO: 2 and 3),
C3H (SEQ ID NO: 4), PNL (SEQ ID NO: 5), OMT (SEQ ID NO: 6), CAD (SEQ ID
NO: 7), CCR (SEQ ID NO: 8), PAL (SEQ ID NO: 9-11) and 4CL (SEQ ID NO: 12).

In further studies, using the procedure described above, additional cDNA clones
encoding the following Pinus radiata enzymes were isolated: PAL (SEQ ID NO: 45-47);
C4H (SEQ ID NO: 48, 49); C3H (SEQ ID NO: 50-52); OMT (SEQ ID NO: 53-55);
4CL (SEQ ID NO: 56, 57); CCR (SEQ ID NO: 58-70); CAD (SEQ ID NO: 71);
CGT (SEQ ID NO: 72); CBG (SEQ ID NO: 73-80); PNL (SEQ ID NO: 81); LAC
(SEQ ID NO: 82-84); and POX (SEQ ID NO: 85-88).

b) Isolation of cDNA clones by PCR

Two PCR probes, hereinafter referred to as LNB010 and LNB011 (SEQ ID NO:
14 and 15, respectively), were designed based on conserved domains in the following
peroxidase sequences previously identified in other species: vanpox, hvupox6, taepox,
hvupox1, osapox, ntopox2, ntopox1, lespox, pokpox, luspox, athpox, hrpox, spopox,
and tvepox (Genbank accession nos. DU337, M83671, X56011, X58396, X66125,
J02979, D11396, X71593, D1U02, L07554, M58381, X57564, Z22920, and Z3101L,
respectively).

RNA was isolated from pine xylem and first strand cDNA was synthesized as
described above. This cDNA was subjected to PCR using 4 µM LNB010, 4 µM
LNB011, 1x Kogen's buffer, 0.1 mg/ml BSA, 200 mM dNTP, 2 mM Mg2+, and 0.1
U/µl of Taq polymerase (Gibco BRL). Conditions were 2 cycles of 2 min at 94 °C, 1
min at 55 °C and 1 min at 72 °C; 25 cycles of 1 min at 94 °C, 1 min at 55 °C, and 1 min
at 72 °C; and 18 cycles of 1 min at 94 °C, 1 min at 55 °C and 3 min at 72 °C in a
Stratagene Robocycler. The gene was re-amplified in the same manner. A band of
about 200 bp was purified from a TAE agarose gel using a Schleicher & Schuell
Elu-Quik DNA purification kit and cloned into a T-tailed pBluescript vector (Marchuk D. et
al., Nucleic Acids Res. 19:1154, 1991). Based on similarity to known sequences, the
isolated gene (SEQ ID NO: 13) was identified as encoding pine peroxidase (POX).
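
As a quick arithmetic check on the three-stage thermal profile described above, the following sketch totals the number of cycles and the programmed hold time. The data layout and function names are hypothetical, and ramp times between temperatures, which depend on the instrument, are deliberately left out.

```python
# Three-stage profile from the text: (cycles, [(minutes, degrees C), ...]).
PROFILE = [
    (2, [(2, 94), (1, 55), (1, 72)]),
    (25, [(1, 94), (1, 55), (1, 72)]),
    (18, [(1, 94), (1, 55), (3, 72)]),
]


def total_cycles(profile):
    """Sum the cycle counts over all stages."""
    return sum(cycles for cycles, _ in profile)


def hold_minutes(profile):
    """Sum the programmed hold times, ignoring ramping between steps."""
    return sum(
        cycles * sum(minutes for minutes, _ in steps)
        for cycles, steps in profile
    )


print(total_cycles(PROFILE), "cycles")
print(hold_minutes(PROFILE), "minutes of programmed holds")
```

Under these assumptions one pass through the program is 45 cycles and 173 minutes of hold time; the re-amplification mentioned in the text would repeat the same profile.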
Example 3

Use of an O-methyltransferase (OMT) Gene to Modify Lignin Biosynthesis

a) Transformation of tobacco plants with a Pinus radiata OMT gene

Sense and anti-sense constructs containing a sequence including the coding
region of OMT (SEQ ID NO: 53) from Pinus radiata were inserted into Agrobacterium
tumefaciens LBA4301 (provided as a gift by Dr. C. Kado, University of California,
Davis, CA) by direct transformation using published methods (see An G, Ebert PR,
Mitra A, Ha SB: Binary Vectors. In: Gelvin SB, Schilperoort RA (eds) Plant
Molecular Biology Manual, Kluwer Academic Publishers, Dordrecht (1988)). The
presence and integrity of the transgenic constructs were verified by restriction digestion
and DNA sequencing.
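
The relationship between a sense and an anti-sense insert can be sketched as follows, assuming that the anti-sense orientation simply places the reverse complement of the coding strand downstream of the promoter. The function name is hypothetical, and the 21-base fragment is copied from SEQ ID NO: 6 purely as sample input; the constructs described above used a sequence including the coding region of SEQ ID NO: 53.

```python
# Translation table for the four bases plus the unknown-base code N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Sample input only: a short stretch beginning at an ATG in SEQ ID NO: 6.
coding = "ATGGCAGGCACAAGTGTTGCT"
antisense = reverse_complement(coding)

print("sense     ", coding)
print("anti-sense", antisense)
```

Applying the function twice returns the original strand, which is a convenient sanity check when the orientation of an insert is confirmed by sequencing.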

Tobacco (Nicotiana tabacum cv. Samsun) leaf sections were transformed using
the method of Horsch et al. (Science, 227:1229-1231, 1985). Five independent
transformed plant lines were established for the sense construct and eight independent
transformed plant lines were established for the anti-sense construct for OMT.
Transformed plants containing the appropriate lignin gene construct were verified using
Southern blot experiments. A mark in the column labeled "Southern" in Table 1 below
indicates that the transformed plant lines were confirmed as independent transformed
lines.

b) Expression of Pinus OMT in transformed plants

Total RNA was isolated from each independent transformed plant line created
with the OMT sense and anti-sense constructs. The RNA samples were analysed in
Northern blot experiments to determine the level of expression of the transgene in each
transformed line. The data shown in the column labeled "Northern" in Table 1 show
that the transformed plant lines containing the sense and anti-sense constructs for OMT
all exhibited high levels of expression, relative to the background on the Northern blots.
OMT expression in sense plant line number 2 was not measured because the RNA
sample showed signs of degradation. There was no detectable hybridisation to RNA
samples from empty vector-transformed control plants.

c) Modulation of OMT enzyme activity in transformed plants

The total activity of OMT enzyme, encoded by the Pinus OMT gene and by the
endogenous tobacco OMT gene, in transformed tobacco plants was analysed for each
transformed plant line created with the OMT sense and anti-sense constructs. Crude
protein extracts were prepared from each transformed plant and assayed using the
method of Zhang et al. (Plant Physiol. 113:65-74, 1997). The data contained in the
column labeled "Enzyme" in Table 1 show that the transformed plant lines containing
the OMT sense construct generally had elevated OMT enzyme activity, with a
maximum of 199%, whereas the transformed plant lines containing the OMT anti-sense
construct generally had reduced OMT enzyme activity, with a minimum of 35%,
relative to empty vector-transformed control plants. OMT enzyme activity was not
estimated in sense plant line number 3.

d) Effects of Pinus OMT on lignin concentration in transformed plants

The concentration of lignin in the transformed tobacco plants was determined
using the well-established procedure of thioglycolic acid extraction (see Freudenberg
et al. in "Constitution and Biosynthesis of Lignin", Springer-Verlag, Berlin, 1968).
Briefly, whole tobacco plants, of an average age of 38 days, were frozen in liquid
nitrogen and ground to a fine powder in a mortar and pestle. 100 mg of frozen powder
from one empty vector-transformed control plant line, the five independent transformed
plant lines containing the sense construct for OMT and the eight independent
transformed plant lines containing the anti-sense construct for OMT were extracted
individually with methanol, followed by 10% thioglycolic acid and finally dissolved in
1 M NaOH. The final extracts were assayed for absorbance at 280 nm. The data shown
in the column labelled "TGA" in Table 1 show that the transformed plant lines
containing the sense and the anti-sense OMT gene constructs all exhibited significantly
decreased levels of lignin, relative to the empty vector-transformed control plant lines.

These data clearly indicate that lignin concentration, as measured by the TGA
assay, can be directly manipulated by either sense or anti-sense expression of a lignin
biosynthetic gene such as OMT.



Example 4

Use of a 4-Coumarate:CoA ligase (4CL) Gene to Modify Lignin Biosynthesis

a) Transformation of tobacco plants with a Pinus radiata 4CL gene

Sense and anti-sense constructs containing a sequence including the coding
region of 4CL (SEQ ID NO: 56) from Pinus radiata were inserted into Agrobacterium
tumefaciens LBA4301 by direct transformation as described above. The presence and
integrity of the transgenic constructs were verified by restriction digestion and DNA
sequencing.

Tobacco (Nicotiana tabacum cv. Samsun) leaf sections were transformed as
described above. Five independent transformed plant lines were established for the
sense construct and eight independent transformed plant lines were established for the
anti-sense construct for 4CL. Transformed plants containing the appropriate lignin
gene construct were verified using Southern blot experiments. A mark in the column
labeled "Southern" in Table 2 indicates that the transformed plant lines listed were
confirmed as independent transformed lines.

WO 98/11205    PCT/NZ97/00112

b) Expression of Pinus 4CL in transformed plants

Total RNA was isolated from each independent transformed plant line created
with the 4CL sense and anti-sense constructs. The RNA samples were analysed in
Northern blot experiments to determine the level of expression of the transgene in each
transformed line. The data shown in the column labelled "Northern" in Table 2 below
show that the transformed plant lines containing the sense and anti-sense constructs for
4CL all exhibited high levels of expression, relative to the background on the Northern
blots. 4CL expression in anti-sense plant line number 1 was not measured because the
RNA was not available at the time of the experiment. There was no detectable
hybridisation to RNA samples from empty vector-transformed control plants.

c) Modulation of 4CL enzyme activity in transformed plants

The total activity of 4CL enzyme, encoded by the Pinus 4CL gene and by the
endogenous tobacco 4CL gene, in transformed tobacco plants was analysed for each
transformed plant line created with the 4CL sense and anti-sense constructs. Crude
protein extracts were prepared from each transformed plant and assayed using the
method of Zhang et al. (Plant Physiol. 113:65-74, 1997). The data contained in the
column labeled "Enzyme" in Table 2 show that the transformed plant lines containing
the 4CL sense construct had elevated 4CL enzyme activity, with a maximum of 258%,
and the transformed plant lines containing the 4CL anti-sense construct had reduced
4CL enzyme activity, with a minimum of 59%, relative to empty vector-transformed
control plants.

d) Effects of Pinus 4CL on lignin concentration in transformed plants

The concentration of lignin in samples of transformed plant material was
determined as described in Example 3. The data shown in the column labelled "TGA"
in Table 2 show that the transformed plant lines containing the sense and the
anti-sense 4CL gene constructs all exhibited significantly decreased levels of lignin, relative
to the empty vector-transformed control plant lines. These data clearly indicate that
lignin concentration, as measured by the TGA assay, can be directly manipulated by
either sense or anti-sense expression of a lignin biosynthetic gene such as 4CL.

Example 5

Transformation of Tobacco using the Inventive Lignin Biosynthetic Genes

Sense and anti-sense constructs containing sequences including the coding
regions of C3H (SEQ ID NO: 18), F5H (SEQ ID NO: 19), CCR (SEQ ID NO: 25) and
CGT (SEQ ID NO: 31) from Eucalyptus grandis, and PAL (SEQ ID NO: 45 and 47),
C4H (SEQ ID NO: 48 and 49), PNL (SEQ ID NO: 81) and LAC (SEQ ID NO: 83)
from Pinus radiata were inserted into Agrobacterium tumefaciens LBA4301 by direct
transformation as described above. The presence and integrity of the transgenic
constructs were verified by restriction digestion and DNA sequencing.

Tobacco (Nicotiana tabacum cv. Samsun) leaf sections were transformed as
described in Example 3. Up to twelve independent transformed plant lines were
established for each sense construct and each anti-sense construct listed in the
preceding paragraph. Transformed plants containing the appropriate lignin gene
construct were verified using Southern blot experiments. All of the transformed plant
lines analysed were confirmed as independent transformed lines.

Example 6

Manipulation of Lignin Content in Transformed Plants

a) Determination of transgene expression by Northern blot experiments

Total RNA was isolated from each independent transformed plant line described in
Example 5. The RNA samples were analysed in Northern blot experiments to
determine the level of expression of the transgene in each transformed line. The
column labelled "Northern" in Table 3 shows the level of transgene expression for all
plant lines assayed, relative to the background on the Northern blots. There was no
detectable hybridisation to RNA samples from empty vector-transformed control
plants.

b) Determination of lignin concentration in transformed plants

The concentration of lignin in empty vector-transformed control plant lines and in
up to twelve independent transformed lines for each sense construct and each anti-sense
construct described in Example 5 was determined as described in Example 3. The
column labelled "TGA" in Table 3 shows the thioglycolic acid extractable lignins for
all plant lines assayed, expressed as the average percentage of TGA extractable lignins
in transformed plants versus control plants. The range of variation is shown in
parentheses.

Table 3

Transformed plant lines containing the sense and the anti-sense lignin
biosynthetic gene constructs all exhibited significantly decreased levels of lignin,
relative to the empty vector-transformed control plant lines. The most dramatic effects
on lignin concentration were seen in the F5H anti-sense plants with as little as 35% of
the amount of lignin in control plants, and in the PAL anti-sense plants with as little as
37% of the amount of lignin in control plants. These data clearly indicate that lignin
concentration, as measured by the TGA assay, can be directly manipulated by
conventional anti-sense methodology and also by sense over-expression using the
inventive lignin biosynthetic genes.
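
The "average percentage (range)" form used to report TGA extractable lignins can be sketched as follows. This is an illustration under stated assumptions: the absorbance readings are invented, the function name is hypothetical, and each transformed line is normalised to the mean of the control readings, which the text does not spell out.

```python
from statistics import mean


def percent_of_control(line_values, control_values):
    """Express each line as a percentage of the mean control value.

    Returns (average, lowest, highest), each rounded to a whole percent.
    """
    control_mean = mean(control_values)
    if control_mean == 0:
        raise ValueError("control mean must be non-zero")
    percents = [100.0 * value / control_mean for value in line_values]
    return round(mean(percents)), round(min(percents)), round(max(percents))


# Hypothetical A280 readings, not data from the tables.
control_readings = [0.80, 0.80]
line_readings = [0.28, 0.40, 0.52]

average, low, high = percent_of_control(line_readings, control_readings)
print(f"{average} ({low}-{high})")
```

With the invented readings above, the three lines come out at 35%, 50% and 65% of the control, reported as an average of 50 with a range of 35 to 65.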

Example 7

Modulation of Lignin Enzyme Activity in Transformed Plants

The activities and substrate specificities of selected lignin biosynthetic enzymes
were assayed in crude extracts from transformed tobacco plants containing sense and
anti-sense constructs for PAL (SEQ ID NO: 45), PNL (SEQ ID NO: 81) and LAC
(SEQ ID NO: 83) from Pinus radiata, and CGT (SEQ ID NO: 31) from Eucalyptus
grandis.

Enzyme assays were performed using published methods for PAL (Southerton,
S.G. and Deverall, B.J., Plant Path. 39:223-230, 1990), CGT (Vellekoop, P. et al.,
FEBS 330:36-40, 1993), PNL (Espin, C.J. et al., Phytochemistry 44:17-22, 1997) and
LAC (Bao, W. et al., Science 260:672-674, 1993). The data shown in the column
labelled "Enzyme" in Table 4 show the average enzyme activity from replicate
measures for all plant lines assayed, expressed as a percent of enzyme activity in empty
vector-transformed control plants. The range of variation is shown in parentheses.

Table 4

transgene   orientation   no. of lines   Enzyme
control     na            3              100
PAL         sense         5              87 (60-124)
PAL         anti-sense    3              53 (38-80)
CGT         anti-sense    1              89
PNL         anti-sense    6              144 (41-279)
LAC         sense         5              78 (16-240)
LAC         anti-sense    11             64 (14-106)

All of the transformed plant lines, except the PNL anti-sense transformed plant
lines, showed average lignin enzyme activities which were significantly lower than the
activities observed in empty vector-transformed control plants. The most dramatic
effects on lignin enzyme activities were seen in the PAL anti-sense transformed plant
lines in which all of the lines showed reduced PAL activity and in the LAC anti-sense
transformed plant lines which showed as little as 14% of the LAC activity in empty
vector-transformed control plant lines.

Example 8

Functional Identification of Lignin Biosynthetic Genes

Sense constructs containing sequences including the coding regions for PAL
(SEQ ID NO: 47), OMT (SEQ ID NO: 53), 4CL (SEQ ID NO: 56 and 57) and POX
(SEQ ID NO: 86) from Pinus radiata, and OMT (SEQ ID NO: 23 and 24), CCR (SEQ
ID NO: 26-28), CGT (SEQ ID NO: 31 and 33) and POX (SEQ ID NO: 42 and 44) from
Eucalyptus grandis were inserted into the commercially available protein expression
vector, pProEX-1 (Gibco BRL). The resultant constructs were transformed into E. coli
XL1-Blue (Stratagene), which were then induced to produce recombinant protein by the
addition of IPTG. Purified proteins were produced for the Pinus OMT and 4CL
constructs and the Eucalyptus OMT and POX constructs using Ni column
chromatography (Janknecht, R. et al., Proc. Natl. Acad. Sci. 88:8972-8976, 1991).
Enzyme assays for each of the purified proteins conclusively demonstrated the expected
substrate specificity and enzymatic activity for the genes tested.

The data for two representative enzyme assay experiments, demonstrating the
verification of the enzymatic activity of a Pinus radiata 4CL gene (SEQ ID NO: 56)
and a Pinus radiata OMT gene (SEQ ID NO: 53), are shown in Table 5. For the 4CL
enzyme, one unit equals the quantity of protein required to convert the substrate into
product at the rate of 0.1 absorbance units per minute. For the OMT enzyme, one unit
equals the quantity of protein required to convert 1 pmole of substrate to product per
minute.

Table 5

            purification   total ml   total mg   total units   % yield    fold
transgene   step           extract    protein    activity      activity   purification
4CL         crude          10 ml      51 mg      4200          100        1
            Ni column      4 ml       0.84 mg    3680          88         53
OMT         crude          10 ml      74 mg      4600          100        1
            Ni column      4 ml       1.2 mg     4487          98         60


The data shown in Table 5 indicate that both the purified 4CL enzyme and the
purified OMT enzyme show high activity in enzyme assays, confirming the
identification of the 4CL and OMT genes described in this application. Crude protein
preparations from E. coli transformed with empty vector show no activity in either the
4CL or the OMT enzyme assay.
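
The yield and fold-purification figures reported for the Ni column step follow from the total protein and total activity values by the usual purification-table arithmetic. The sketch below assumes specific activity is total units divided by total mg of protein; the function name is hypothetical.

```python
def purification_summary(crude_mg, crude_units, purified_mg, purified_units):
    """Return percent yield and fold purification for one purification step.

    Specific activity is taken as total units divided by total mg protein.
    """
    crude_specific = crude_units / crude_mg
    purified_specific = purified_units / purified_mg
    return {
        "yield_percent": round(100 * purified_units / crude_units),
        "fold_purification": round(purified_specific / crude_specific),
    }


# Total protein and total activity for the crude and Ni column steps.
print("4CL", purification_summary(51, 4200, 0.84, 3680))
print("OMT", purification_summary(74, 4600, 1.2, 4487))
```

Run on the 4CL and OMT rows, this gives yields of 88% and 98% and purifications of 53-fold and 60-fold, matching the tabulated values.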

Although the present invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, changes and
modifications can be carried out without departing from the scope of the invention
which is intended to be limited only by the scope of the appended claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION
(i) APPLICANT: Genesis Research and Development Corp. Ltd.

(ii) TITLE OF THE INVENTION: MATERIALS AND METHODS FOR
THE MODIFICATION OF PLANT LIGNIN CONTENT

(iii) NUMBER OF SEQUENCES: 88

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Russell McVeagh West-Walker

(B) STREET: The Todd Building, Cnr Brandon Street & 

Lambton Quay 

(C) CITY: Wellington 

(D) STATE: 

(E) COUNTRY: New Zealand 

(F) ZIP: 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: Wordperfect 5.1 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Bennett, Michael Roy 

(B) REGISTRATION NUMBER: 

(C) REFERENCE/DOCKET NUMBER: 22315XMRB 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: +64 4 495 7740 

(B) TELEFAX: +64 4 499 9306 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 535 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 : 

CTTCGCGCTA CCGCATACTC CACCACCGCG TGCAGAAGAT GAGCTCGGAG GGTGGGAAGG 
60 

AGGATTGCCT CGGTTGGGCT GCCCGGGACC CTTCTGGGTT CCTCTCCCCN TACAAATTCA 
120 

CCCGCAGGCC GTGGGAAGCG AAGACGTCTC GATTAAGATC ACGCACTGTG GAGTGTGCTA 
180 

CGCAGATGTG GCTTGGACTA GGAATGTGCA GGGACACTCC AAGTATCCTC TGGTGCCGGG
240

GCACGAGATA GTTGGAATTG TGAAACAGGT TGGCTCCAGT GTCCAACGCT TCAAAGTTGG
300

CGATCATGTG GGGGTGGGAA CTTATGTCAA TTCATGCAGA GAGTGCGAGT ATTGCAATGA
360

CAGGCTAGAA GTCCAATGTG AAAAGTCGGT TATGACTTTT GATGGAATTG ATGCAGATGG
420

TACAGTGACA AAGGGAGGAT ATTCTAGTCA CATTGTCGTC CATGAAAGGT ATTGCGTCAG
480

GATTCCAGAA AACTACCCGA TGGATCTAGC AGCGCATTGC TCTGTGCTGG ATCAC
535

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 671 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

GCGCCTGCAG GTCGACACTA GTGGATCCAA AGAATTCGGC ACGAGGTTGC AGGTCGGGGA
60

TGATTTGAAT CACAGAAACC TCAGCGATTT TGCCAAGAAA TATGGCAAAA TCTTTCTGCT
120

CAAGATGGGC CAGAGGAATC TTGTGGTAGT TTCATCTCCC GATCTCGCCA AGGAGGTCCT
180

GCACACCCAG GGCGTCGAGT TTGGGTCTCG AACCCGGAAC GTGGTGTTCG ATATCTTCAC
240

GGGCAAGGGG CAGGACATGG TGTTCACCGT CTATGGAGAT CACTGGAGAA AGATGCGCAG
300

GATCATGACT GTGCCTTTCT TTACGAATAA AGTTGTCCAG CACTACAGAT TCGCGTGGGA
360

AGACGAGATC AGCCGCGTGG TCGCGGATGT GAAATCCCGC GCCGAGTCTT CCACCTCGGG
420

CATTGTCATC CGTAGCGCCT CCAGCTCATG ATGTATAATA TTATGTATAG GATGATGTTC
480

GACAGGAGAT TCGAATCCGA GGACGACCCG CTTTTCCTCA AGCTCAAGGC CCTCAACGGA
540

GAGCGAAGTC GATTGGCCCA GAGCTTTGAG TACAATTATG GGGATTTCAT TCCCAGTCTT
600

AGGCCCTTCC TCAGAGGTTA TCACAGAATC TGCAATGAGA TTAAAGAGAA ACGGCTCTCT
660

CTTTTCAAGG A
671

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 940 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CTTCAGGACA AGGGAGAGAT CAATGAGGAT AATGTTTTGT ACATCGTTGA GAACATCAAC
60

GTTGCAGCAA TTGAGACAAC GCTGTGGTCG ATGGAATGGG GAATAGCGGA GCTGGTGAAC
120

CACCAGGACA TTCAGAGCAA GGTGCGCGCA GAGCTGGACG CTGTTCTTGG ACCAGGCGTG
180

CAGATAACGG AACCAGACAC GACAAGGTTG CCCTACCTTC AGGCGGTTGT GAAGGAAACC
240

CTTCGTCTCC GCATGGCGAT CCCGTTGCTC GTCCCCCACA TGAATCTCCA CGACGCCAAG
300

CTCGGGGGCT ACGATATTCC GGCAGAGAGC AAGATCCTGG TGAACGCCTG GTGGTTGGCC
360

AACAACCCCG CCAACTGGAA GAACCCCGAG GAGTTCCGCC CCGAGCGGTT CTTCGAGGAG
420

GAGAAGCACA CCGAAGCCAA TGGCAACGAC TTCAAATTCC TGNCCTTCGG TGTGGGGAGG
480

AGGAGCTGCC CGGGAATCAT TCTGGCGCTG CTCTCCTCGC ACTCTCCATC GGAAGACTTG
540

TTCAGAACTT CCACCTTCTG CCGCCGCCCG GGCAGAGCAA AGTGGATGTC ACTGAGAAGG
600

GCGGGCAATT CAGCCTTCAC ATTCTCAACC ATTCTCTCAT CGTCGCCAAG CCCATAGCTT
660

CTGCTTAATC CCAACTTGTC AGTGACTGGT ATATAAATGC GCCCACCTGA ACAAAAAACA
720

CTCCATCTAT CATGACTGTG TGTGCGTGTC CACTGTCGAG TCTACTAAGA GCTCATAGCA
780

CTTCAAAAGT TTGCTAGGAT TTCAATAACA GACACCGTCA ATTATGTCAT GTTTCAATAA
840

AAGTTTGCAT AAATTAAATG ATATTTCAAT ATACTATTTT GACTCTCCAC CAATTGGGGA
900

ATTTTACTGC TAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
940

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 949 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

NNGCTCNACC GACGGTGGAC GGTCCGCTAC TCAGTAACTG AGTGGGATCC CCCGGGCTGA
60

CAGGCAATTC GATTTAGCTC ACTCATTAGG CACCCCAGGC TTTACACTTT ATGCTTCCGG
120

CTCGTATGTT GTGTGGAATT GTGAGCGGAT AACAATTTCA CACAGGAAAC AGCTATGACC
180

ATGATTACGC CAAGCGCGCA ATTAACCCTC ACTAAAGGGA ACAAAAGCTG GAGCTCCACC
240

GCGGTGGCGG CCGCTCTAGA ACTAGTGGAT CCAAAGAATT CGGCACGAGA CCCAGTGACC
300

TTCAGGCCTG AGAGATTTCT TGAGGAAGAT GTTGATATTA AGGGCCATGA TTACAGGCTA
360

CTGCCATTGG TGCAGGGCGC AGGATCTGCC CTGGTGCACA ATTGGGTATT AATTTAGTTC
420

AGTCTATGTT GGGACACCTG CTTCATCATT TCGTATGGGC ACCTCCTGAG GGAATGAAGG
480

CAGAAGACAT AGATCTCACA GAGAATCCAG GGCTTGTTAC TTTCATGGCC AAGCCTGTGC
540

AGGCCATTGC TATTCCTCGA TTGCCTGATC ATCTCTACAA GCGACAGCCA CTCAATTGAT
600

CAATTGATCT GATAGTAAGT TTGAATTTTG TTTTGATACA AAACGAAATA ACGTGCAGTT
660

TCTCCTTTTC CATAGTCAAC ATGCAGCTTT CTTTCTCTGA AGCGCATGCA GCTTTCTTTC
720

TCTGAAGCCC AACTTCTAGC AAGCAATAAC TGTATATTTT AGAACAAATA CCTATTCCTC
780

AAATTGAGWA TTTCTCTGTA GGGGNNGNTA ATTGTGCAAT TTGCAAGNAA TAGTAAAGTT
840

TANTTTAGGG NATTTTAATA GTCCTANGTA ANANGNGGNA ATGNTAGNGG GCATTNAGAA
900

ANCCCTAATA GNTGTTGGNG GNNGNTAGGN TTTTTNACCA AAAAAAAAA
949

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GAATTCGGCA CGAGAAAGCC CTAGAATTTT TTCAGCATGC TATCACAGCC CCAGCGACAA
60

CTTTAACTGC AATAACTGTG GAAGCGTACA AAAAGTTTGT CCTAGTTTCT CTCATTCAGA
120

CTGGTCAGGT TCCAGCATTT CCAAAATACA CACCTGCTGT TGTCCAAAGA AATTTGAAAT
180

CTTGCACTCA GCCCTACATT GATTTAGCAA ACAACTACAG TAGTGGGAAA ATTTCTGTAT
240

TGGAAGCTTG TGTCAACACG AACACAGAGA AGTTCAAGAA TGATAGTAAT TTGGGGTTAG
300

TCAAGCAAGT TTTGTCATCT CTTTATAAAC GGAATATTCA GAGATTGACA CAGACATATC
360

TGACCCTCTC TCTTCAAGAC ATAGCAAGTA CGGTACAGTT GGAGACTGCT AAGCAGGCTG
420

AACTCCATGT TCTGCAGATG ATTCAAGATG GTGAGATTTT TGCAACCATA AATCAGAAAG
480

ATGGGATGGT GAGCTTCAAT GAGGATCCTG AACAGTACAA AACATGTCAG ATGACTGAAT
540

ATATAGATAC TGCAATTCGG AGAATCATGG CACTATCAAA GAAGCTCACC ACAGTAGATG
600

AGCAGATTTC GTGTGATCAT TCCTACCTGA GTAAGGTGGG GAGAGAGCGT TCAAGATTTG
660

ACATAGATGA TTTTGATACT GTTCCCCAGA AGTTCANAAA TATGTAACAA ATGATGTAAA
720

TCATCTTCAA GACTCGCTTA TATTCATTAC TTTCTATGTG AATTGATAGT CTGTTAACAA
780

TAGTACTGTG GCTGAGTCCA GAAAGGATCT CTCGGTATTA TCACTTGACA TGCCATCAAA
840

AAAATCTCAA ATTTCTCGAT GTCTAGTCTT GATTTTGATT ATGAATGCGA CTTTTAGTTG
900

TGACATTTGA GCACCTCGAG TGAACTACAA AGTTGCATGT TAAAAAAAAA AAAAAAAAA
959

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1026 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GAATTCGGCA CGAGCTTTGA GGCAACCTAC ATTCATTGAA TCCCAGGATT TCTTCTTGTC
60

CAAACAGGTT TAAGGAAATG GCAGGCACAA GTGTTGCTGC AGCAGAGGTG AAGGCTCAGA
120

CAACCCAAGC AGAGGAGCCG GTTAAGGTTG TCCGCCATCA AGAAGTGGGA CACAAAAGTC
180

TTTTGCAGAG CGATGCCCTC TATCAGTATA TATTGGAAAC GAGCGTGTAC CCTCGTGAGC
240

CCGAGCCAAT GAAGGAGCTC CGCGAAGTGA CTGCCAAGCA TCCCTGGAAC CTCATGACTA
300 

CTTCTGCCGA TGAGGGTCAA TTTCTGGGCC TCCTGCTGAA GCTCATTAAC GCCAAGAACA 
360 

CCATGGAGAT TGGGGTGTAC ACTGGTTACT CGCTTCTCAG CACAGCCCTT GCATTGCCCo 
420 

ATGATGGAAA GATTCTAGCC ATGGACATCA ACAGAGAGAA CTATGATATC GGATTGCCTA 
480 

TTATTGAGAA AGCAGGAGTT GCCCACAAGA TTGACTTCAG AGAGGGCCCT GCTCTGCCAG 
540 

TTCTGGACGA ACTGCTTAAG AATGAGGACA TGCATGGATC GTTCGATTTT GTGTTCGTGG 
600 

ATGCGGACAA AGACAACTAT CTAAACTACC ACAAGCGTCT GATCGATCTG GTGAAGGTTG 
660 

GAGGTCTGAT TGCATATGAC AACACCCTGT GGAACGGATC TGTGGTGGCT CCACCCGATG 
720 

CTCCCCTGAG GAAATATGTG AGATATTAGA GAGATTTCGT GATGGAGCTA AACAAGGCCC 
780 

TTGCTGTCGA TCCCCGCATT GAGATCAGCC AAATCCCAGT CGGTGACGGC GTCACCCTTT 
840 

GCAGGCGTGT CTATTGAAAA CAATCCTTGT TTCTGCTCGT CTATTGCAAG CATAAAGGCT 
900 

CTCTGATTAT AAGGAGAACG CTATAATATA TGGGGTTGAA GCCATTTGTT TTGTTTAGTG 
960 

TATTGATAAT AAAGTAGTAC AGCATATGCA AAGTTTGTAT CAAAAAAAAA AAAAAAAAAA 
1020 

AAAAAA 
1026 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1454 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GAATTCGGCA CGAGGCCAAC TGCAAGCAAT ACAGTACAAG AGCCAGACGA TCGAATCCTG 
60 

TGAAGTGGTT CTGAAGTGAT GGGAAGCTTG GAATCTGAAA AAACTGTTAC AGGATATGCA 
120 

GCTCGGGACT CCAGTGGCCA CTTGTCCCCT TACACTTACA ATCTCAGAAA GAAAGGACCT 
180 

GAGGATGTAA TTGTAAAGGT CATTTACTGC GGAATCTGCC ACTCTGATTT AGTTCAAATG 
240 

CGTAATGAAA TGGACATGTC TCATTACCCA ATGGTCCCTG GGCATGAAGT GGTGGGGATT 
300 

GTAACAGAGA TTGGCAGCGA GGTGAAGAAA TTCAAAGTGG GAGAGCATGT AGGGGTTGGT 
360 

TGCATTGTTG GGTCCTGTCG CAGTTGCGGT AATTGCAATC AGAGCATGGA ACAATACTGC 
420 

AGCAAGAGGA TTTGGACCTA CAATGATGTG AACCATGACG GCACACCTAC TCAGGGCGGA 
480 

TTTGCAAGCA GTATGGTGGT TGATCAGATG TWTGTGGTTC GAATCCCGGA GAATCTTCCT 
540 

CTGGAACAAG CGGCCCCTCT GTTATGTGCA GGGGTTACAG TTTTCAGCCC AATGAAGCAT 
600 

TTCGCCATGA CAGAGCCCGG GAAGAAATGT GGGATTTTGG GTTTAGGAGG CGTGGGGCAC 
660 

ATGGGTGTCA AGATTGCCAA AGCCTTTGGA CTCCACGTGA CGGTTATCAG TTCGTCTGAT 
720 

AAAAAGAAAG AAGAAGCCAT GGAAGTCCTC GGCGCCGATG CTTATCTTGT TAGCAAGGAT 
780 
ACTGAAAAGA TGATGGAAGC AGCAGAGAGC CTAGATTACA TAATGGACAC CATTCCAGTT 
840 

GCTCATCCTC TGGAACCATA TCTTGCCCTT CTGAAGACAA ATGGAAAGCT AGTGATGCTG 
900 

GGCGTTGTTC CAGAGTCGTT GCACTTCGTG ACTCCTCTCT TAATACTTGG GAGAAGGAGC 
960 

ATAGCTGGAA GTTTCATTGG CAGCATGGAG GAAACACAGG AAACTCTAGA TTTCTGTGCA 
1020 

GAGAAGAAGG TATCATCGAT GATTGAGGTT GTGGGCCTGG ACTACATCAA CACGGCCATG 
1080 

GAAAGGTTGG AGAAGAACGA TGTCCGTTAC AGATTTGTGG TGGATGTTGC TAGAAGCAAG 
1140 

TTGGATAATT AGTCTGCAAT CAATCAATCA GATCAATGCC TGCATGCAAG ATGAATAGAT 
1200 

CTGGACTAGT AGCTTAACAT GAAAGGGAAA TTAAATTTTT ATTTAGGAAC TCGATACTGG 
1260 

TTTTTGTTAC TTTAGTTTAG CTTTTGTGAG GTTGAAACAA TTCAGATGTT TTTTTAACTT 
1320 

GTATATGTAA AGATCAATTT CTCGTGACAG TAAATAATAA TCCAATGTCT TCTGCGAAAT 
1380 

TAATATATGT ATTCGTATTT TTATATGAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
1440 

AAAAAAAAAA AAAA 
1454 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 740 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GAATTCGGCA CGAGACCATT TCCAGCTAAT ATTGGCATAG CAATTGGTCA TTCTATCTTT 
60 

GTCAAAGGAG ATCAAACAAA TTTTGAAATT GGACCTAATG GTGTGGAGGC TAGTCAGCTA 
120 

TACCCAGATG TGAAATATAC CACTGTCGAT GAGTACCTCA GCAAATTTGT GTGAAGTATG 
180 

CGAGATTCTC TTCCACATGC TTCAGAGATA CATAACAGTT TCAATCAATG TTTGTCCTAG 
240 

GCATTTGCCA AATTGTGGGT TATAATCCTT CGTAGGTGTT TGGCAGAACA GAACCTCCTG 
300 

TTTAGTATAG TATGACGAGC TAGGCACTGC AGATCCTTCA CACTTTTCTC TTCCATAAGA 
360 

AACAAATACT CACCTGTGGT TTGTTTTCTT TCTTTCTGGA ACTTTGGTAT GGCAATAATG 
420 

TCTTTGGAAA CCGCTTAGTG TGGAATGCTA AGTACTAGTG TCCAGAGTTC TAAGGGAGTT 
480 

CCAAAATCAT GGCTGATGTG AACTGGTTGT TCCAGAGGGT GTTTACAACC AACAGTTGTT 
540 

CAGTGAATAA TTTTGTTAGA GTGTTTAGAT CCATCTTTAC AAGGCTATTG AGTAAGGTTG 
600 

GTGTTAGTGA ACGGAATGAT GTCAAATCTT GATGGGCTGA CTGACTCTCT TGTGATGTCA 
660 

AATCTTGATG GATTGTGTCT TTTTCAATGG TAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
720 

AAAAAAAAAA AAAAAAAAAA 
740 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 624 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GAATTCCTGC AGCCCGGGGG ATCCACTAGT TCTAGAGCGG CCGCCACCGC GGTGGAGCTC 
60 

GCGCGCCTGC AGGTCGACAC TAGTGGATCC AAAGAATTCG GCACGAGGCC CGACGGCCAC 
120 

TTGTTGGACG CCATGGAAGC TCTCCGGAAA GCCGGGATTC TGGAACCGTT TAAACTGCAG 
180 

CCCAAGGAAG GACTGGCTCT CGTCAACGGC ACAGCGGTGG GATCCGCCGT GGCCGCGTCC 
240 

GTCTGTGTTG ACGCCAACGT GCTGGGCGTG CTGGCTGAGA TTCTGTCTGC GCTCTTCTGC 
300 

GAGGTGATGC AAGGGAAACC GGAGTTCGTA GATCCGTTAA CCCACCAGTT GAAGCACCAC 
360 

CCAGGGCAGA TCGAAGCCGC GGCCGTCATG GAGTTCCTCC TCGACGGTAG CGACTACGTG 
420 

AAAGAAGCAG CGCGGCTTCA CGAGAAAGAC CCGTTGAGCA AACCGAAACA AGACCGCTAC 
480 

GCTCTGCGAA CATCGCCACA GTGGTTGGGG CCTCCGATCG AAGTCATCCG CGCTGCYACT 
540 

CACTCCATCG AGCGGGAGAT CAATTCCGTC AACGACAATC CGTTAATCGA TGTCTCCAGG 
600 

GACATGGCTG TCCACGGCGG CAAC 
624 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 279 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

GAATTCCTGC AGCCCGGGGG ATCCACTAGT TCTAGAGCGG CCGCCACCGC GGTGGAGCTC 
60 

CAGTACCTGG CCAACCCCGT CACGACTCAC GTCCAGAGCG CCGAACAACA CAACCAGGAT 
120 

GTCAATTCCC TCGGCTTGAT CTCCGCCAGA AAGACTGCCG AGGCCGTTGA GATTTTAAAG 
180 

CTGATGTTCG CTACATATCT GGTGGCCTTA TGCCAGGCGA TCGATCTCCG GCACCTGGAA 
240 

GAAAACATGC GATCCGTTGT GAAGCACGTA GTCTTGCA 
278 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 765 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GAGCTCCTGC AAGTCATCGA TCATCAGCCC GTTTTCTCGT ACATCGACGA TCCCACAAAT 
60 

CCATCATACG CGCTTATGCT CCAACTCAGA GAAGTGCTCG TAGATGAGGC TCTCAAATCA 
120 

TCTTGCCCAG ACGGGAATGA CGAATCCGAT CACAATTTGC AGCCCGCTGA GAGCGCTGGA 
180 

GCTGCTGGAA TATTACCCAA TTGGGTGTTT AGCAGGATCC CCATATTTCA AGAGGAGTTG 
240 

AAGGCCCGTT TAGAGGAAGA GGTTCCGAAG GCGAGGGAAC GATTCGATAA TGGGGACTTC 
300 

CCAATTGCAA ACAGAATAAA CAAGTGCAGG ACATATCCCA TTTACAGATT CGTGAGATCA 
360 

GAGTTGGGAA CCGATTTGCT AACAGGGCCC AAGTGGAGAA GCCCCGGCGA AGATATAGAA 
420 

AAGGTATTTG AGGGCATTTG CCAAGGGAAA ATTGGAAACG TGATCCTCAA ATGTCTGGAC 
480 

GCTTGGGGTG GGTGCGCTGG ACCATTCACT CCACGTGCAT ATCCTGCGTC TCCTGCAGCG 
540 

TTCAATGCCT CATATTGGGC ATGGTTTGAT AGCACCAAAT CACCCTCTGC AACGAGCGGC 
600 

AGAGGTTTCT GGAGCGCCCA ACAACAACAA GTTCTTTGAT TTAACTGACT CTTAAGCATT 
660 

CCTAAACAGC TTGTTCTTCG CAATAACGAA TCTTTCATCT TCGTTACTTT GTAAAAGATG 
720 

GGGTTCCAAC AAAATAGAAG AAATATTTTC GATCCAAAAA AAAAA 
765 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 453 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

TGATTATGCG GATCCTTGGG CAGGGATACG GCATGACAGA AGCAGGCCCG GTGCTGGCAA 
60 

TGAACCTAGC CTTCGCAAAG AATCCTTTCC CCGCCAAATC TGGCTCCTGC GGAACAGTCG 
120 

TCCGGAACGC TCAAATAAAG ATCCTCGATT ACAGGAACTG GCGAGTCTCT CCCGCACAAT 
180 

CAAGCCGGCG AAATCTGCAT CCGCGGACCC GAAATAATGA AAGGATATAT TAACGACCCG 
240 

GAATCCACGG CCGCTACAAT CGATGAAGAA GGCTGGCTCC ACACAGGCGA CGTCGGGTAC 
300 

ATTGACGATG ACGAAGAAAT CTTCATAGTC GACAGAGTAA AGGAGATTAT CAATATAAAG 
360 

GCTTCCAGGT GGATCCTGCT AATCGAATTC CTGCAGCCCG GGGGTCCACT AGTTCTAGAG 
420 

CGGCCGCCAC CGCGGTGGAG CTCCAGCTTT TGT 
453 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 278 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

TCTTCGAATT CTCTTTCACG ACTGCTTCGT TAATGGCTGC GATGGCTCGA TATTGTTAGA 
60 

TGATAACTCA ACGTTCACCG GAGAAAAGAC TGCAGGCCCA AATGTTAATT CTGCGAGAGG 
120 

ATTCGACGTA ATAGACACCA TCAAAACTCA AGTTGAGGCA GCCTGCAGTG GTGTCGTGTC 
180 

AGTTGCCGAC ATTCTCGCCA TTGCTGCACG CGATTCAGTC GTCCAACTGG GGGGCCCAAC 
240 

ATGGACGGTA CTTCTGGGAG AAAAGACGGA TCCGATCA 
278 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

CTTCGAATTC WYTTYCAYGA YTG 
23 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GATCGGATCC RTCYYKYCTY CC 
22 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 472 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

AATTCGGCAC GAGACGACCT CTTGTATCGG ACCCGGATCC GCTATCGTTA ACGTACACAC 
60 

GTTCTAGTGC TGAATGGAGA TGGAGAGCAC CACCGGCACC GGCAACGGCC TTCACAGCCT 
120 

CTGCGCCGCC GGGAGCCACC ATGCCGACCC ACTGAACTGG GGGGCGGCGG CAGCAGCCCT 
180 

CACAGGGAGC CACCTCGACG AGGTGAAGCG GATGGTCGAG GAGTACCGGA GGCCGGCGGT 
240 

GCGCCTCGGC GGGGAGTCCC TCACGATAGC CCAGGTGGCG GCGGTGGCGA GTCAGGAGGG 
300 

GGTAGGGGTC GAGCTCTCGG AGGCGGCCCG TCCCAGGGTC AAGGCCAGCA GCGACTGGGT 
360 

CATGGAGAGC ATGAACAAGG GAACTGACAG CTACGGGGTC ACCACCGGGT TCGGCGGCAA 
420 

CTTCTCAAAC CGGAGGCCGA AGCAAGGCGG TCCTTTTCAG AAGGAACTTA TA 
472 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 622 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

CCAAAGCTCC TAGTGCCTCA TGAGTCTGCT GAGGATTGCA CAATTGGCGG GTTCGACGTG 
60 

CCCCGAGGCA CCATGATCCT GGTTAATGCG TGGGCAATTC AAAGAGACCC AAAAGTGTGG 
120 

GACGATCCCA CAAATTTTAA ACCGGAGAGG TACGAGGGAT TGGAAGGTGA TCATGCCTAC 
180 

CGACTATTGC CGTTTGGGAT GGGGAGGAGA AGTTGTCCTG GTGCTGGCCT TGCCAATAGA 
240 

GTGGTGAGCT TGGTCCTGGC GGCGCTTATT CAGTGCTTCG AATGGGAACG AGTTGGCGAA 
300 

GAATTGGTGG ACTTGTCCGA GGGGACGGGA CTCACAATGC CAAAGAGAGA GCCATTGGAG 
360 

GCCTTGTGCA AAGCGCGTGA ATGCATGATA GCTAATGTTC TTGCGCACCT TTAAGAAGGT 
420 

CGTTGTCTAA TGAATTTACA TTGGTGATGT ATCTCCAATG TTTTTGAATA ATCAAATAGA 
480 

CTGAAAATAG GCCAGTGCAG CTTTAGGAAT GATCGTGAGC ATCAATAGCA TCCTGAGGAG 
540 

GCCAATGCAG CTTTAGGCCT TTCTCTTAGG AGAAAAATGA TGGTTTATAT AGGTACTGGC 
600 

AACATTGTTC AAAAAAAAAA AA 
622 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 414 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

CACGCTCGAC GAATTCGGTA CCCCGGGTTC GAAATCGATA AGCTTGGATC CAAAGCAACA 
60 

CATTGAACTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCCCCCACCC CCCCTTCCCA 
120 

ACCCCACCCA CATACAGACA AGTAGATACG CGCACACAGA AGAAGAAAAG ATGGGGGTTT 
180 

CAATGCAGTC AATCGCACTA GCGACGGTTC TGGCCGTCCT AACGACATGG GCGTGGAGGG 
240 

CGGTGAACTG GGTGTGGCTG AGGCCGAAGA GGCTCGAGAG GCTTCTGAGA CAGCAAGGTC 
300 

TCTCCGGCAA GTCCTACACC TTCCTGGTCG GCGACCTCAA GGAGAACCTG CGGATGCTCA 
360 

AGGAAGCCAA GTCCAAGCCC ATCGCCGTCT CCGATGACAT CAAGCCTCGT CTCT 
414 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 469 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

GAATTCGGCA CGAGTGTCTC TCTCTCTCTC TCTCTCTGTA AACCACCATG CTCTTCCTCA 
60 

CTCATCTCCT AGCAGTTCTA GGGGTTGTGT TGCTCCTGCT AATTCTATGG AGGGCAAGAT 
120 

CTTCTCCGAA CAAACCCAAA GGTACTGCCT TACCCCCGGA GCTGCCGGGC GCATGGCCGA 
180 

TCATAGGCCA CATCCACTTG CTGGGCGGCG AGACCCCGCT GGCCAGGACC CTGGCCGCCA 
240 

TGGCGGACAA GCAGGGCCCG ATGTTTCGGA TCCGTCTCGG AGTCCACCCG GCGACCATCA 
300 

TAAGCAGCCG TGAGGCGGTC CGGGAGTGCT TCACCACCCA CGACAAGGAC CTCGCTTCTC 
360 

GCCCCAAATC CAAGGCGGGA ATCCACTTGG GCTACGGGTA TGCCGGTTTT GGCTTCGTAG 
420 

AATACGGGCA CTTTTGGCGC GAGATGAGGA AGATCACCAT GCTCGAGCT 
469 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 341 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

CGGGCTCGTG GCTCGGCTCC GGCGCAACGC CCTTCCCACC GGGCCCGAGG GGCCTCCCGG 
60 

TCATCGGGAA CATGCTCATG ATGGGCGAGC TCACCCACCG CGGCCTCGCG AGTCTGGCGA 
120 

AGAAGTATGG CGGGATCTTC CACCTCCGCA TGGGCTTCCT GCACATGGTT GCCGTGTCGT 
180 

CCCCCGACGT GGCCCGCCAG GTCCTCCAGG TCCACGACGG GATCTTCTCG AACCGGCCTG 
240 

CCACCATCGC GATCAGCTAC CTCACGTATG ACCGGGCCGA CATGGCCTTC GCGCACTACG 
300 

GCCCGTTCTG GCGGCAGATG CGGAAGCTGT GCGTGATGAA A 
341 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 387 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

GAATTCGGCA CGAGCGGGCT CGTGGCTCGG CTCCGGCGCA ACGCCCTTCC CACCGGGCCC 
60 

GAGGGGCCTC CCGGTCATCG GGAACATGCT CATGATGGGC GAGCTCACCC ACCGCGGCCT 
120 

CGCGAGTCTG GCGAAGAAGT ATGGCGGGAT CTTCCACCTC CGCATGGGCT TCCTGCACAT 
180 

GGTTGCCGTG TCGTCCCCCG ACGTGGCCCG CCAGGTCCTC CAGGTCCACG ACGGGATCTT 
240 

CTCGAACCGG CCTGCCACCA TCGCGATCAG CTACCTCACG TATGACCGGG CCGACATGGC 
300 

CTTCGCGCAC TACGGCCCGT TCTGGCGGCA GATGCGGAAG CTGTGCGTGA TGAAAGCTCT 
360 

TCAGCGGAAG CGGGCTGAGT CGTGGGA 
387 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 443 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

CACGAGCTCG TGAGCCTTCC CGGAGACAAG GCCATCTTAC TTCGCAACAA ATTGCGTCCG 
60 

CACTCCTTTC TCAAGAAACC TAGTCATCCA AGAAGCAGAG CATTGCAACT GCAAACAGCC 
120 

AAAGCCCAAA CTCGTACAGA AGGAGAGAGA GAGAGAGAAT AGAAGCATGA GTGCATGCAC 
180 

GAACCAAGCA ATCACGACGG CCAGTGAAGA TGAAGAGTTC TTGTTCGCCA TGGAAATGAA 
240 

TGCTCTGATA GCACTCCCCT TGGTCTTGAA GGCCACCATC GAACTGGGGA TCCTCGAAAT 
300 

ACTGGCCGAG TGCGGGCCTA TGGCTCCACT TTCGCCTGCT CAGATTGCCT CCCGTCTCTC 
360 

CGCAAAGAAC CCGGAAGCCC CCGTAACCCT TGACCGGATC CTCCGGTTTC TCGCCAGCTA 
420 

CTCCATCCTC TCTTGCACTC TCG 
443 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 607 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

GAATTCGGCA CGAGCCAACC CTGGACCAGG TACTTTTGGC AGGCGGTCCA TTGCCCTTCA 
60 

AACCGGTCCA AACCGGACCA TCACTGTCCT TATATACGTT GCATCATGCC TGCTCATAGA 
120 

ACTTAGGTCA ACTGCAACAT TTCTTGATCA CAACATATTA CAATATTCCT AAGCAGAGAG 
180 

AGAGAGAGAG AGAGAGAGAG AGAGAGAGAG AGAGTTTGAA TCAATGGCCA CCGCCGGAGA 
240 

GGAGAGCCAG ACCCAAGCCG GGAGGCACCA GGAGGTTGGC CACAAGTCTC TCCTTCAGAG 
300 

TGATGCTCTT TACCAATATA TTTTGGAGAC CAGCGTGTAC CCAAGAGAGC CTGAGCCCAT 
360 

GAAGGAGCTC AGGGAAATAA CAGCAAAACA TCCATGGAAC ATAATGACAA CATCAGCAGA 
420 

CGAAGGGCAG TTCTTGAACA TGCTTCTCAA GCTCATCAAA GCCAAGAACA CCATGGAGAT 
480 

TGGTGTCTTC ACTGGCTACT CTCTCCTCGC CACCGCTCTT GCTCTTCCTG ATGACGGAAA 
540 

GATTTTGGCT ATGGACATTA ACAGAGAGAG CTATGAACTT GGCCTGCCGG CATCCAAAAA 
600 

GCCGGTG 
607 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 421 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

GAATTCGGCA CGAGCCGTTT TATTTCCTCT GATTTCCTTT GCTCGAGTCT CGCGGAAGAG 
60 

AGAGAAGAGA GGAGAGGAGA GAATGGGTTC GACCGGATCC GAGACCCAGA TGACCCCGAC 
120 

CCAAGTCTCG GACGAGGAGG CGAACCTCTT CGCCATGCAG CTGGCGAGCG CCTCCGTGCT 
180 

CCCCATGGTC CTCAAGGCCG CCATCGAGCT CGACCTCCTC GAGATCATGG CCAAGGCCGG 
240 

s— r— f— f f* f~* ^ TTCCTCTCCC CGGGGGAAGT CGCGGCCCAG CTCCCGACCC AGAACCCCGA 
300 

GGCACCCGTA ATGCTCGACC GGATCTTCCG GCTGCTGGCC AGCTACTCCG TGCTCACGTG 
360 

CACCCTCCGC GACCTCCCCG ATGGCAAGGT CGAGCGGCTC TACGGCTTAG • v G^-CGGTGTG 
420 

C 
421 










(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 760 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

GGAAGAAGCC GAGCAAACGA ATTGCAGACG CCATTGAAAA AAGACACGAA AGAGATCAAG 
60 

AAGGAGCTTA AGAAGCATCA TCAATGGCAG CCAACGCAGA GCCTCAGCAG ACCCAACCAG 
120 

CGAAGCATTC GGAAGTCGGC CACAAGAGCC TCTTGCAGAG CGATGCTCTC TACCAGTATA 
180 

TATTGGAGAC CAGCGTCTAC CCAAGAGAGC CAGAGCCCAT GAAGGAGCTC AGGGAAATAA 
240 

CAGCCAAACA TCCATGGAAC CTGATGACCA CATCGGCGGA TGAAGGGCAG TTCCTGAACA 
300 

TGCTCCTCAA GCTCATCAAC GCCAAGAACA CCATGGAGAT CGGCGTCTAC ACCGGCTACT 
360 

CTCTCCTCGC AACCGCCCTT GCTCTTCCCG ATGACGGAAA GATCTTGGCC ATGGCCATCA 
420 

ATAGGGAGAA CTTCGAGATC GGGCTGCCCG TCATCCAGAA GGCCGGCCTT GCCCACAAGA 
480 

TCGATTTCAG AGAAGGCCCT GCCCTGCCGC TCCTTGATCA GCTCGTGCAA GATGAGAAGA 
540 

ACCATGGAAC GTACGACTTC TTCTCAATCC TTAATCGTTC ATTTGAATAC AAATACATGC 
600 

TCAATGGTTC AAAGACAACA TAAGACAGAA GATGGAAAAA ATAGAAAGGA AGGAAAGTAT 
660 

TAAGGGTAGT TTCTCATTTC ATCAATGCTT GATTTTGAGA TCTCCTTTCT GGTGCGATCA 
720 

GCTGACCCGG CGGCACAGGT GATGCCATCC CCGACGGGAA 
760 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 508 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

GAATTCGGTA CCCGGGTTCG AAATCGATAA GCTTGGATCC AAAGAATTCG GCACGAGATC 
60 

ACTAACCATC TGCCTTTCTT CATCTTCTTT CTTCTGCTTC TCCTCCGTTT CCTCGTTTCG 
120 

ATATCGTGAA AGGAGTCCGT CGACGACAAT GGCCGAGAAG AGCAAGGTCC TGATCATCGG 
180 

AGGGACGGGC TACGTCGGCA AGTTCATCGT GGAAGCGAGT GCAAAAGCAG GGCATCCCAC 
240 

GTTCGCGCTG GTTAGGCAGA GCACGGTCTC CGACCCCGTC AAGGGCCAGC TCGTCGAGAG 
300 

CTTCAAGAAC TTGGGCGTCA CTCTGCTCAT CGGTGATCTG TACGATCATG AGAGCTTGGT 
360 

GAAGGCAATC AAGCAAGCCG ACGTGGTGAT ATCGACAGTG GGGCACATGC AAATGGCGGA 
420 

TCAGACCAAA GAATCGTCGA CGCCATTAAA GGAAGCTGGC AACGTTAAGG TTTGTTGGTT 
480 

GGTTCATTTG ATCTGGTTTG GGGGGGTC 
508 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 495 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

GAATTCGGCA CGAGGTTAAT GGCAGTGCAG CCTCAACACC ACCCACCTTC CTCCATCTCT 
60 

CTCCTCCCTT CTTCTTTCTC TGACTTCAAT GGCAGCCGAC TCCATGCTTG CGTTCAGTAT 
120 

AAGAGGAAGG TGGGGCAGCC TAAAGGGGCA CTGCGGGTCA CTGCATCAAG CAATAAGAAG 
180 

ATCCTCATCA TGGGAGGCAC CCGTTTCATC GGTGTGTTTT TGTCGAGACT ACTTGTCAAA 
240 

GAAGGTCATC AGGTCACTTT GTTTACCAGA GGAAAAGCAC CCATCACTCA ACAATTGCCT 
300 

GGTGAGTCGG ACAAGGACTT CGCTGATTTT TCATCCAAGA TCCTGCATTT GAAAGGAGAC 
360 

AGAAAGGATT TTGATTTTGT TAAATCTAGT CTTGCTGCAG AAGGCTTTGA CGTTGTTTAT 
420 

GACATTAACG GCGAGAGGCG GATGAAGTCG CACCAATTTT GGATGCCTGC CAAACCTTGA 
480 

ACCAGTCAAC TACTG 
495 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 472 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






WO 98/11205 



PCT/NZ97/00112 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

GAATTCGGCA CGAGCATAAG CTCTCCCGTA ATCCTCACAT CACATGGCGA AGAGCAAGGT 
60 

CCTCGTCGTT GGCGGCACTG GCTACCTCGG GCGGAGGTTC GTGAGGGCGA GCCTGGACCA 
120 

GGGCCACCCC ACGTACGTCC TCCAGCGTCC GGAGACCGGC CTCGACATTG AGAAGCTCCA 
180 

GACGCTACTG CGCTTCAAGA GGCGTGGCGC CCAACTCGTC GAGGCCTCGT TCTCAGACCT 
240 

GAGGAGCCTC GTCGACGCTG TGAGGCGGGT CGATGTCGTC GTCTGTGCCA TGTCGGGGGT 
300 

CCACTTCCGG AGCCACAACA TCCTGATGCA GCTCAAGCTC GTGGAGGCTA TCAAAGAAGC 
360 

TGGAAATGTC AAGCGGTTTT TGCCGTCAGA GTTCGGAATG GACCCGGCCC TCATGGGTCA 
420 

TGCAATTGAG CCGGGAAGGG TCACGTTCGA TGAGAAATGG AGGTGAGAAA AG 
472 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 396 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

GAATTCGGCA CGAGGAGGCA CCTCCTCGAA ACGAAGAAGA AGAAGGACGA AGGACGAAGG 
60 

AGACGAAGGC GAGAATGAGC GCGGCGGGCG GTGCCGGGAA GGTCGTGTGC GTGACCGGGu 
120 

CGTCCGGTTA CATCGCCTCG TGGCTCGTCA AGCTCCTCCT CCAGCGCGGC TACACCGTCA 
180 

AGGCCACCGT CCGCGATCCG AATGATCCAA AAAAGACTGA ACATTTGCTT GGACTTGATG 
240 

GAGCGAAAGA TAGACTTCAA CTGTTCAAAG CAAACCTGCT GGAAGAGGGT TCATTTGATC 
300 

CTATTGTTGA GGGTTGTGCA GGCGTTTTTC AAACTGCCTC TCCCTTTTAT CATGATGTCA 
360 

AGGATCCGCA GGCAGAATTA CTTGATCCGG CTGTAA 
396 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 592 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

GAATTCGGCA CGAGGTTGAA CCTCCCGTCC TCGGCTCTGC TCGGCTCGTC ACCCTCTTCG 
60 

CGCTCCCGCA TACTCCACCA CCGCGTACAG AAGATGAGCT CGGAGGGTGG GAAGGAGGAT 
120 

TGCCTCGGTT GGGCTGCCCG GGACCCTTCT GGGTTCCTCT CCCCCTACAA ATTCACCCGC 
180 

AGGGCCGTGG GAAGCGAAGA CGTCTCGATT AAGATCACGC ACTGTGGAGT GTGCTACGCA 
240 






GATGTGGCTT GGACTAGGAA TGTGCAGGGA CACTCCAAGT ATCCTCTGGT "CAGGGCAC 
300 

GAGATAGTTG GAATTGTGAA ACAGGTTGGC TCCAGTGTCC AACGCTTCAA .-.GTTGGCGAT 
360 

CATGTGGGGG TGGGAACTTA TGTCAATTCA TGCAGAGAGT GCGAGTATTG CAATGACAGG 
420 

GTAGAAGTCC AATGTGAAAA GTCGGTTATG ACTTTTGATG GAATTGATGG AGATGGTACA 
480 

GTGAGAAAGG GAGGATATTC TAGTCACATT GTCGTCCATG AAAGGTATTG 33TCAGGATT 
540 

CCAGAAAACT ACCCGATGGA TCTAGCAGGG CATTTGCTCT GTGCTGGATG AC 
592 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 463 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

GAATTCGGCA CGAGAACTCA TCTTGAAATG TCATTGGAGT CATCATCCTC TAGTGAGAAG 
60 

AAACAAATGG GTTCCGCCGG ATTCGAATCG GCCACAAAGC CGCACGCCGT TTGCATTCCC 
120 

TACCCTGCAC AAAGCCACAT TGGCGCCATG CTCAAGCTAG CAAAGCTCCT CCATCACAAG 
180 

GGCTTCCACA TCTCCTTCGT CAACACCGAG TTCAACCACC GGCGGCTCGC CAGGGCTCGA 
240 

GGCCCCGAGT TCACAAATGG AATGCTGAGC GACTTTCAGT TCCTGACAAT CCCCGATGGT 
300 

CTTCCTCCTT CGGACTTGGA TGCGATCCAA GACATCAAGA TGCTCTGCGA ATCGTCCAGG 
360 

AACTATATGG TCAGCCCCAT CAACGATCTT GTATCGAGCC TGGGCTCGAA CCCGAGCGTC 
420 

CCTCCGGTGA CTTGCATCAA TCTCGGATGG TTTCATGACA CTCGTGAC 
469 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 405 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 

CTTTACTCCG CCAAGAAGAT CCAATCGCAG TTTTCGCAAT TGGCCCATTA CACAAATGCG 
60 

GTCCATCTTC ATCGGGAAGT CTCTTGGCAG AAGACCGGAG TTGCATTTCC TGGCTGGACA 
120 

AGCAAGCCCC TAACTCAGTG GTCTATGTGA GTCTTGGGAG CATCGCCTCT GTGAACGAGT 
180 

CGGAATTTTC CGAAATAGCT TTAGGTTTAG CCGATAGCCA GCAGCCATTC TTGTGGGTGG 
240 

TTCGACCCGG GTCAGTGAGC GGCTCGGAAC TCTTAGAGAA TTTGCCCGGT TGCTTTCTGG 
300 

AGGCATTACA GGAGAGGGGG AAGATTGTGA AATGGGCGCC TCAACATGAA GTGCTGGCTC 
360 

ATCGGGCTGT CGGAGCGTTT TGGACTCACA ATGGATGGAA CTCCA 
405 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 380 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

GGCAAACACG CCCGTTTTCG TTTTACTAAG AGAAGATGGT GAGCGTTGTG GCTGGTAGAG 
60 

TCGAGAGCTT GTCGAGCAGT GGCATTCAGT CGATCCCGCA GGAGTATGTG AGGCCGAAGG 
120 

AGGAGCTCAC AAGCATTGGC GACATCTTCG AGGAGGAGAA GAAGCATGAG GGCCCTCAGG 
180 

TCCCGACCAT CGACCTCGAG GACATAGCGT CTAAAGACCC CGTGGTGAGG GAGAGGTGCC 
240 

ACGAGGAGCT CAGGAAGGCT GCCACCGACT GGGGCGTCAT GCACCTCGTC AACCATGGGA 
300 

TCCCCAACGA CCTGATTGAG CGTGTAAAGA AGGCTGGCGA GGTGTTCTTC AACCTCCCGA 
360 

TCGAGGAGAA GGACAAGCAT 
380 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 305 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 



TTGTACCCGA AGATCTCCGG GACCGTTCGA CGGCGACATC GCCGTCGGCC GGGAACCCGT 
60 

CGAGGCCGCC GCCGGAGGCC GGGGAGAAGC TGGAGT A.\j'w GCCGTAGCCG GAGAAGGCGC 
120 

CGTCGTGGTC GGCGGCGGCG GCGTGGTGGA CCTCATCGCC GTCCATGCTG AAGGCGTCGA 
180 

AGGAAGCGGA CATGGCTGGG GGATCGATCG ACCGATCCGA TCGGCCGGAG GATTTCGAGA 
240 

TCGGAGATGG AGAGATGGAA ATGAAAGAGA GAGAGAGAGA GAGATCCGGT GGACTGGTGG 
300 

TGTTT 
305 












(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 693 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

GAATTCGGCA CGAGCTAAGA GAGGAGAGGA GAGGAGCAAG ATGGCACTAG CAGGAGCTGC 
60 

ACTGTCAGGA ACCGTGGTGA GCTCCCCCTT TGTGAGGATG CAGCCTGTGA ACAGACTCAG 
120 

GGCATTCCCC AATGTGGGTC AGGCCCTGTT TGGTGTCAAC TCTGGCCGTG GCAGAGTGAC 
180 

TGCCATGGCC GCTTACAAGG TCACCCTGCT CACCCCTGAA GGCAAAGTCG AACTCGACGT 
240 

CCCCGACGAT GTTTACATCT TGGACTACGC CGAGGAGCAA GGCATCGACT TCCCCTACTC 
300 

CTGCCGTGCC GGCTCTTGCT CCTCCTGCGC GGGCAAGGTC GTGGCGGGGA GCGTCGACCA 
360 

GAGCGACGGC AGCTTCCTGG ATGATGATCA GATTGAGGAA GGTTGGGTCC TCACTTGTGT 
420 

CGCCTACCCT AAGTCTGAGG TCACCATTGA GACCCACAAG GAAGAGGAGC TCACTGCTTG 
480 

AAGCTCTCCT ATATTTGCTT TTGCATAAAT CAGTCTCACT CTACGCAACT TTCTCCACTC 
540 

TCTCCCCCCT TCACTACATG TTTGTTAGTT CCTTTAGTCT CTTCCTTTTT TACTGTACGA 
600 

GGGATGATTT GATGTTATTC TGAGTCTAAT GTAATGGCTT TTCTTTTTCC TATTTCTGTA 
660 

TGAGGAAATA AAACTCATGC TCTAAAAAAA AAA 
693 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 418 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

AGGACTTTAT TATAAGCATT GTAAAAAGAG TCAAACTAAT ACATCGCAAG AATTGGGTTA 
60 

TCCAATAATC TACAAAAAGA AAAAAGTTTG ATGCATTGAG ATGGTAACTG CTTAATTCAA 
120 

ATGCCTTAGT TTGAAAAATT AACCAACTAT TAAAATTAAT GATGATGAAT ATGGATT-TG 
180 

TGTGAAAAAC TATATAGACT TAAAATTGAC TCAGAAGACA TTCTTTTCTT CTTATTTTAT 
240 

GATATGATGA ATTCGGTCTA AACAGGCAAA TGGTGTCAAA CGGGAAGTCG GCAAAAC~C^ 
300 

TCCTCGCCAG TGACTACCGG GCGGGCGATG ATGCGGATCC GGGGGCCGGG TCGCTGGAG- 
360 

ACATCCCGCA CGGACCGGTC CACGTTTGGT GCGGTGACAA CAGGCAGCCC AACCTGGP 
418 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 777 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

GAATTCGGCA CGAGCATACA ACTACACTGC GACGCCGCCG CAGAACGCGA GCGTGCCGAC 
60 

CATGAACGGC ACCAAGGTCT ACCGGTTGCC GTATAACGCT ACGGTCCAGC TCGTTTTACA 
120 

GGACACCGGG ATAATCGCGC CGGAGACCCA CCCCATCCAT CTGCACGGAT TCAACTTCTT 
180 

CGGTGTGGGC AAAGGAGTGG GGAATTATGA CCCAAAGAAG GATCCCAAGA AGTTCAATCT 
240 

GGTTGACCCA GTGGAGAGGA ACACCATTGG AATCCCATCT GGTGGATGGA TAGCCATCAG 
300 

ATTCACAGCA GACAATCCAG GAGTTTGGTT CCTGCACTGC CATCTGGAAG TGCACACAAC 
360 

TTGGGGACTG AAGATGGCAT TCTTGGTGGA CAATGGGAAG GGGCCTAAAG AGACCCTGCT 
420 

TCCACCTCCA AGTGATCTTC CAAAATGTTG ATCATTTGAT CATGAGGACG ACAAGCGATT 
480 

ACTAATGACA CCAAGTTAGT GGAATCTTCT CTTTGAAAAA GAAGAAGAAG AGCAAGAAGA 
540 

ATAAGAAAGA TGAGGAGAGA AGCCATAGAA GATTTGACCA AGAAGAGAGA GGGCAATAAA 
600 

CCAAAGAGAC CCTTGAGATC ACGACATCCC GCAATTGTTT CTAGAGTAAT AGAAGGATTT 
660 

ACTCCGACAC TGCTACAATA AATTAAGGAA GACAAGGAAT TTGGTTTTTT TCATTGGAGG 
720 

AGTGTAATTT GTTTTTTGGC AAGCTCATCA CATGAATCAC ATGGAAAAAA AAAAAAA 
777 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 344 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

ATATGTTCAG AATTTCAAAT GTGGGAATGT CAACCTCCTT GAACTTCAGA ATTCAGGGCC 
60 

ATACGTTGAA GCTAGTCGAG GTTGAAGGAT CTCACACCGT CCAGAACATG TATGATTCAA 
120 

TCGATGTTCA CGTGGGCCAA TCCATGGCTG TCTTAGTGAC CTTAAATCAG CCTCCAAAGC 
180 

ACTACTACAT TGTCGCATCC ACCCGGTTCA CCAAGACGGT TCTCAATGCA ACTGCAGTGC 
240 

TACACTACAC CAACTCGCTT ACCCCAGTTT CCGGGCCACT ACCAGCTGGT CCAACTTACC 
300 

AAAAACATTG GTCCATGAAG CAAGCAAGAA CAATCAGGTG GAAC 
344 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 341 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

GCCGCAACTG CAATTCTCTT CGTAAAACAT GACGGCTGTC GGCAAAACCT CTTTCCTCTT 
60 

GGGAGCTCTC CTCCTCTTCT CTGTGGCGGT GACATTGGCA GATGCAAAAG TTTACTACCA 
120 

TGATTTTGTC GTTCAAGCGA CCAAGGTGAA GAGGCTGTGC ACGACCCACA ACACCATCAC 
180 

GGTGAACGGG CAATTCCCGG GTCCGACTTT GGAAGTTAAC GACGGCGACA CCCTCGTTGT 
240 

CAATGTCGTC AACAAAGCTC GCTACAACGT CACCATTCAC TGGCACGGCG TCCGGCAGGT 
300 

GAGATCTGGT TGGGCTGATG GGGCGGAATT TGTGACTCAA T 
341 



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 358 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40: 



GAATTCGGCA CGAGATATGT TCAGAATTTC AAATGTGGGA ATGTCAACCT CCTTGAACTT 
60 

CAGAATTCAG GGCCATACGT TGAAGCTAGT CGAGGTTGAA GGATCTCACA CCGTCCAGAA 
120 

CATGTATGAT TCAATCGATG TTCACGTGGG CCAATCCATG GCTGTCTTAG TGACCTTAAA 
180 

TCAGCCTCCA AAGGACTACT ACATTGTCGC ATCCACCCGG TTCACCAAGA CGGTTCTCAA 
240 

TGCAACTGCA GTGCTACACT ACACCAACTC GCTTACCCCA GTTTCCGGGC CACTACCAGC 
300 

TGGTCCAACT TACCAAAAAC ATTGGTCCAT GAAGCAAGCA AGAACAATCA GGTGGAAC 
358 



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 409 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

ATCAAGAGTT TGAGTCTAAA CCTTGTCTAA TCCTCTCTCG CATAGTCATT TGGAGACGAA 
60 

TGCTGATCGG CCGCAGCTGC ATTCTCTTCG TAAAACATGA CGGCTGTCGG CAAAACCTCT 
120 

TTCCTCTTGG GAGCTCTCCT CCTCTTCTCT GTGGCGGTGA CATTGGCAGA TGCAAAAGTT 
180 

TACTACCATG ATTTTGTCGT TCAAGCGACC AAGGTGAAGA GGCTGTGCAC GACCCACAAC 
240 

ACCATCACGG TGAACGGGCA ATTCCCGGGT CCGACTTTGG AAGTTAACGA CGGCGACACC 
300 

CTCGTTGTCA ATGTCGTCAA CAAAGCTCGC TACAACGTCA CCATTCACTG GCACGGCGTC 
360 

CGGCAGGTGA GATCTGGTTG GGCTGATGGG GCGGAATTTG TGACTCAAT 
409 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 515 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

CTCTCTCTCT CTCTCTCTCT GTGTGTTCAT TCTCGTTGAG CTCGTGGTCG CCTCCCGCCA 
60 

TGGATCCGCA CAAGTACCGT CCATCCAGTG CTTTCAACAC TTCTTTCTGG ACTACGAACT 
120 

CTGGTGCTCC TGTCTGGAAC AATAACTCTT CGTTGACTGT TGGAAGCAGA GGTCCAATTC 
180 

TTCTTGAGGA TTATCACCTC GTGGAGAAAC TTGCCAACTT TGATAGGGAG AGGATTCCAG 
240 

AGCGTGTGGT GCATGCCAGA GGAGCCAGTG CAAAGGGATT CTTTGAGGTC ACTCATGACA 
300 

TTTCCCAGCT TACCTGTGCT GATTTCCTTC GGGCACCAGG AGTTCAAACA CCCGTGATTG 
360 

TCCGTTTCTC CACTGTCATC CACGAAAGGG GCAGCCCTGA AACCCTGAGG GACCCTCGAG 
420 

GTTTTGCTGT GAAGTTCTAC ACAAGAGAGG GTAACTTTGA TCTGGTGGGA AACAATTTCC 
480 

CTGTCTTCTT TGTCCGTAAT GGGATAAATT CCCCG 
515 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 471 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

GAATTCGGCA CGAGGCTCCC TCTCGTACTG CCATACTCCT GGGACGGGAT TCGGATAGGG 
60 

ATTTGCGGCG ATCCATTTCT CGATTCAAGG GGAAGAATCA TGGGGAAGTC CTACCCGACC 
120 

GTAAGCCAGG AGTACAAGAA GGCTGTCGAG AAATGCAAGA AGAAGTTGAG AGGCCTCATC 
180 

GCTGAGAAGA GCTGCGCTCC GCTCATGCTC CGCATCGCGT GGCACTCCGC CGGTACCTTC 
240 

GATGTGAAGA CGAAGACCGG AGGCCCGTTC GGGACCATGA AGCACGCCGC GGAGCTCAGC 
300 

CACGGGGCCA ACAGCGGGCT CGACGTTGCC GATCAGCTCT TGCAGCCGAT CAAGGATCAG 
360 

TTCCCCGTCA TCACTTATGC TGATTTCTAC CAGCTGGCTG GCGTCGTTGC TGTGGAAGTT 
420 

ACTGGTGGAC CTGAAGTTGC TTTTCACCCG GAAGAGAGGC AAACCACAAC C 
471 

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 487 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

GAATTCGGCA CGAGCTCCCA CTTCTGTCTC GCCACCATTA CTAGCTTCAA AGCCCAGATC 
60 

TCAGTTTCGT GCTCTCTTCG TCATCTCTGC CTCTTGCCAT GGATCCGTAC AAGTATCGCC 
120 

CGTCCAGCGC TTACGATTCC AGCTTTTGGA CAACCAACTA CGGTGCTCCC GTCTGGAACA 
180 

ATGACTCATC GCTGACTGTT GGAACTAGAG GTCCGATTCT CCTGGAGGAC TACCATCTGA 
240 

TTGAGAAACT TGCCAACTTC GAGAGAGAGA GGATTCCTGA GCGGGTGGTC CATGCACGGG 
300 

GAGCCAGCGC GAAAGGGTTC TTCGAGGTCA CCCACGACAT CTCTCACTTG ACCTGTGCTG 
360

ATTTCCTCCG GGCTCCTGGA GTCCAGACGC CCGTAATCGT CCSTTTC7CC ACCGTCATCC
420 

ACGAGCGCGG CAGCCCGAAC CTCAGGGACC CTCGTGGTTT TGCAGTGAAG TTCTACACCA 
480 

GAGAGGG 
487 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 684 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GAATTCCTGC AGCCCGGGGG ATCCACTAGT TCTAGAGCGG CCGCCACCGC GGTGGAGCTC 
60 

GCGCGCCTGC AGGTCGACAC TAGTGGATCC AAAGAATTCG GCACGAGGCC CGACGGCCAC 
120 

TTGTTGGACG CCATGGAAGC TCTCCGGAAA GCCGGGATTC TGGAACCGTT TAAACTGCAG 
180 

CCCAAGGAAG GACTGGCTCT CGTCAACGGC ACAGCGGTGG GATCCGCCGT GGCCGCGTCC 
240 

GTCTGTTTTG ACGCCAACGT GCTGGGCGTG CTGGCTGAGA TTCTGTCTGC GCTCTTCTGC 
300 

GAGGTGATGC AAGGGAAACC GGAGTTCGTA GATCCGTTAA CCCACCAGTT GAAGCACCAC 
360 

CCAGGGCAGA TCGAAGCCGC GGCCGTCATG GAGTTCCTCC TCGACGGTAG CGACTACGTG 
420 

AAAGAAGCAG CGCGGCTTCA CGAGAAAGAC CCGTTGAGCA AACCGAAACA AGACCGCTAC 
480 

GCTCTGCGAA CATCGCCACA GTGGTTGGGG CCTCCGATCG AAGTCATCCG CGCTGCTACT 
540 

CACTCCATCG AGCGGGAGAT CAATTCCGTC AACGACAATC CGTTAATCGA TGTCTCCAGG 
600 

GACATGGCTC TCCACGGCGG CAACTTCCAG GGAACACCCA TCGGAGTTTC CATGGACAAC 
660 

ATGCGAATCT CTTTGGCAGC CGTC 
684 

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 418 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

GAATTCGGCA CGAGGACAAG GTCATAGGCC CTCTCTTCAA ATGCTTGGAT GGGTGGAAAG 
60 

GAACTCCTGG CCCATTCTGA AATAAATAAT CTTCCAAGAT CGCCTTTATA CAACGACTGC 
120 

TATGATTTGA GTCCTCGGAT CTTTTTGTTG ATGCAGTTGT TTACCGATCT GGAATTTGAT 
180 

TGGTCATAAA GCTTGATTTT GTTTTTCTTT CTTTTGTTTT ATACTGCTGG ATTTGCATCC 
240 

CATTGGATTT GCCAGAAATA TGTAAGGGTG GCAGATCATT TGGGTGATCT GAAACATGTA 
300

AAAGTGGCGG ATCATTTGGG TAGCATGCAG ATCAGTTGGG TGATCGTGTA CTGCTTTCAC 
360

TATTACTTAC ATATTTAAAG ATCGGGAATA AAAACATGAT TTTAATTGAA AAAAAAAA
418

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 479 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

GATATCCCAA CGACCGAAAA CCTGTATTTT CAGGGCGCCA TGGGGATCCG GAATTCGGCA 
60 

CGAGCAAGGA AGAAAATATG GTTGCAGCAG CAGAAATTAC GCAGGCCAAT GAAGTTCAAG 
120 

TTAAAAGCAC TGGGCTGTGC ACGGACTTCG GCTCGTCTGG CAGCGATCCA CTGAACTGGG 
180 

TTCGAGCAGC CAAGGCCATG GAAGGAAGTC ACTT7GAAGA AGTGAAAGCG ATGGTGGATT 
240 

CGTATTTGGG AGCCAAGGAG ATTTCCATTG AAGGGAAATC TCTGACAATC TCAGACGTTG 
300 

CTGCCGTTGC TCGAAGATCG CAAGTGAAAG TGAAATTGGA TGCTGCGGCT GCCAAATCTA 
360 

GGGTCGAGGA GAGTTCAAAC TGGGTTCTCA CCCAGATGAC CAAGGGGACG GATACCTATG 
420 

GTGTCACTAC TGGTTTCGGA GCCACTTCTC ACAGGAGAAC GAACCAGGGA GCCGAGCTT 
479 

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1785 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

TATCGATAAG CTTGATATCG AATTCCTGCA GCCCGGGGGA TCCACTAGTT CTAGAGCGGC 
60 

CGCCACCGCG GTGGAGCTCG CGCGCCTGCA GGTCGACACT AGTGGATCCA AAGAATTCGG 
120 

CACGAGGTTG CAGGTCGGGG ATGATTTGAA TCACAGAAAC CTCAGCGATT TTGCCAAGAA 
180 

ATATGGCAAA ATCTTTCTGC TCAAGATGGG CCAGAGGAAT CTTGTGGTAG TTTCATCTCC 
240 

CGATCTCGCC AAGGAGGTCC TGCACACCCA GGGCGTCGAG TTTGGGTCTC GAACCCGGAA 
300 

CGTGGTGTTC GATATCTTCA CGGGCAAGGG GCAGGACATG GTGTTCACCG TCTATGGAGA 
360 

TCACTGGAGA AAGATGCGCA GGATCATGAC TGTGCCTTTC TTTACGAATA AAGTTGTCCA 
420 

GCACTACAGA TTCGCGTGGG AAGACGAGAT CAGCCGCGTG GTCGCGGATG TGAAATCCCG
480 

CGCCGAGTCT TCCACCTCGG GCATTGTCAT CCGTAGGCGC CTCCAGCTCA TGATGTATAA 
540 

TATTATGTAT AGGATGATGT TCGACAGGAG ATTCGAATCC GAGGACGACC CGCTTTTCCT 
600 

CAAGCTCAAG GCCCTCAACG GAGAGCGAAG TCGATTGGCC CAGAGCTTTG AGTACAATTA 
660 

TGGGGATTTC ATTCCCATTC TTAGGCCCTT CCTCAGAGGT TATCTCAGAA TCTGCAATGA 
720

GATTAAAGAG AAACGGCTC7 CTCTTTTCAA (jOAv. i w : l ^ G7GGAAGAGC 3CAAGAAGCT
780

CAACAGTACC AAGACTAGTA CCAACACCGG GGGAGCTCAA GTGTGCAATG GACCATATT7
840

TAGATGCTCA GGACAAGGGA GAGATCAATG AGGATAATGT TTTGTACATC GTTGAGAACA
900

TCAACGTTGC AGCAATTGAG ACAACGCTGT GGTCGATGGA ATGGGGAATA GCGGAGCTGG
960

TGAACCACCA GGACATTCAG AGCAAGGTGC GCGCAGAGCT GGACGCTGTT 7TTGGACCAG
1020

GCGTGCAGAT AACGGAACCA GACACGACAA GGTTGCCCTA CCTTCAGGCG G7TGTGAAGG
1080

AAACCCTTCG TCTCCGCATG GCGATCCCGT TGCTCGTCCC CCACATGAAT C7CCACGACG
1140

CCAAGCTCGG GGGCTACGAT ATTCCGGCAG AGAGCAAGAT CCTGG7GAA.C GCCTGGT
1200

TGGCCAACAA CCCCGCCAAC TGGAAGAACC CCGAGGAGTT CCGCCCCGAG TGG77C77CG
1260

AGGAGGAGAA GCACACCGAA GCCAATGGCA ACGACTTCAA AT7CC7GCC7 TCGGTGTGGG
1320

GAGGAGGAGC TGCCCGGGAA TCATTCTGGC GCTGCCTCTC C7CGCACTCT CCATCGGAAG
1380

ACTTGTTCAG AACTTCCACC TTCTGCCGCC GCCCGGGCAG AGCAAAG7GG A7GTCAC7GA
1440

GAAGGGCGGG CAGTTCAGCC TTCACATTCT CAACCATTCT CTCA7CGTCG 77AAGCCCA7
1500

AGCTTCTGCT TAATCCCAAC TTGTCAGTGA CTGGTATATA AATGCGCGCA "TGAACAAA
1560

AAACACTCCA TCTATCATGA CTGTGTGTGC GTGTCCACTG 7CGAGTC7AC 7AAGAGCTCA
1620

TAGCACTTCA AAAGTTTGCT AGGATTTCAA TAACAGACAC CGTCAAT7A7 G7CATG777C
1680

AATAAAAGTT TGCATAAATT AAATGATATT TCAATATACT A77TTGACTC 7CCACCAATT
1740

GGGGAA7TTT ACTGCTAAAA AAAAAAAAAA AAAAAAAAAA AAAAA
1785

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 475 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

GAATTCGGCA CGAGATTTCC ATGGACGATT CCGTTTGGCT TCAATTCGTT 7CCTCTGGCT
60

GTCCTCGTCC TCGTTTTCCT TGTTCTTCCT CCGACTTTTT CTCTGGAAGC 7ATGGCGTAA
120

TAGGAACCTG CCGCCAGGAC CCCCGGCATG GCCGATCGTA GGGAACGTCC 7TCAGATTGG
180

ATTTTCCAGC GGCGCGTTCG AGACCTCAGT GAAGAAATTC CATGAGAGAT ACGGTCCAA7
240

ATTCACTGTG TGGCTCGGTT CCCGCCCTCT GCTGATGATC ACCGACCGCG AGCT7GCCCA
300

CGAGGCGCTC GTACAGAAGG GCTCCGTCTT* CGCTGACCGC CCGCCCGCCC TCGGGATGCA
360

GAAAATCTTC AGTAGCAACC AGCACAACAT CACTTCGGCT GAATACGGCC CGCTG7GGCG
420

GAGCCTTCGC AGGAATCTGG TTAAAGAAGC CCTGAGACTT CGGCGATGAA GGCTT
475

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 801 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

GCTCCACCGA CGGTGGACGG TCCGCTACTC AGTAACTGAG TGGGATCCCC CGGGCTGACA 
60 

GGCAATTCGA TTTAGCTCAC TCATTAGGCA CCCCAGGCTT TACACTTTAT GCTTCCGGCT 
120 

CGTATGTTGT GTGGAATTGT GAGCGGATAA CAATTTCACA CAGGAAACAG CTATGACCAT 
180 

GATTACGCCA AGCGCGCAAT TAACCCTCAC TAAAGGGAAC AAAAGCTGGA GCTCCACCGC 
240 

GGTGGCGGCC GCTCTAGAAC TAGTGGATCC AAAGAATTCG GCACGAGACC CAGTGACCTT 
300 

CAGGCCTGAG AGATTTCTTG AGGAAGATGT TGATATTAAG GGCCATGATT ACAGGCTACT 
360 

GCCATTCGGT GCAGGGCGCA GGATCTGCCC TGGTGCACAA TTGGGTATTA ATTTAGTTCA 
420 

GTCTATGTTG GGACACCTGC TTCATCATTT CGTATGGGCA CCTCCTGAGG GAATGAAGGC 
480 

AGAAGACATA GATCTCACAG AGAATCCAGG GCTTGTTACT TTCATGGCCA AGCCTGTGCA 
540 

GGCCATTGCT ATTCCTCGAT TGCCTGATCA TCTCTACAAG CGACAGCCAC 7CAATTGATC 
600 

AATTGATCTG ATAGTAAGTT TGAATTTTGT TTTGATACAA AACGAAATAA CGTGCAGTTT
660 

CTCCTTTTCC ATAGTCAACA TGCAGCTTTC TTTCTCTGAA GCGCATGCAG CTTTCTTTCT 
720 

CTGAAGCCCA ACTTCTAGCA AGCAATAACT GTATATTTTA GAACAAATAC CTATTCCTCA 
780 

AATTGAGTAT TTCTCTGTAG G 
801 

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 744 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

GGGCCCCCCT TCGAGGTGGA CACTAGTGGA TCCAAAGAAT TCGGCACGAG GTTTTATCTG 
60 

AAGGACGCTG TGCTTGAAGG CTCCCAGCCA TTCACCAAAG CCCATGGAAT GAATGCGTTC 
120 

GAGTACCCGG CCATCGATCA GAGATTCAAC AAGATTTTCA ACAGGGCTAT GTCTGAGAAT 
180 

TCTACCATGT TGATGAACAA GATTTTGGAT ACTTACGAGG GTTTTAAGGA GGTTCAGGAG 
240 

TTGGTGGATG TGGGAGGAGG TATTGGGTCG ACTCTCAATC TCATAGTGTC TAGGTATCCC 
300 

CACATTTCAG GAATCAACTT CGACTTGTCC CATGTGCTGG CCGATGCTCC TCACTACCCA 
360 

GCTGTGAAAC ATGTGGGTGG AGACATGTTT GATAGTGTAC CAAGTGGCCA AGCTATTTTT 
420 

ATGAAGTGGA TTCTGCATGA TTGGAGCGAT GATCATTGCA GGAAGCTTTT GAAGAATTGT 
480

AGCCCTTGCA TTGCCCGATG ATGGAAAGAT TCTAGCCATG GACATCAACA GAGAGAACTA
540 

TGATATCGGA TTGCCTATAA TT 
562 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1074 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

TCGTGCCGCT CGATCCTCAC AGGCCCTTTT TATTTCCCTG GTGAACGATA CGATGGGCTC 
60 

GCACGCTGAG AATGGCAACG GGGTGGAGGT TGTTGATCCA ACGGACTTAA CTGACATCGA 
120 

GAATGGGAAA CCAGGTTATG ACAAGCGTAC GCTGCCTGCG GACTGGAAGT 7TGGAGTSAA 
180 

GCTTCAAAAC GTTATGGAAG AATCCATTTA CAAGTACATG CTGGAAACAT TCACCCGCCA 
240 

TCGAGAGGAC GAGGCGTCCA AGGAGCTCTG GGAACGAACA TGGAACCTGA CACAGAGAGG 
300 

GGAGATGA73 ACATTGCCAG ATCAGGTGCA GTTCCTGCGC TTGATGGTAA AGATGTCAGG 
360 

TGCTAAAAAG GCATTGGAGA TCGGAGTTTT CACTGGCTAT TCATTGCTCA ATATCGCTCT 
420 

CGCTCTTCCT TCTGATGGCA AGGTGGTAGC TGTGGATCCA GGAGATGACC CCAAATTTGG 
480 

CTGGCCCTGC TTCGTTAAGG CTGGAGTTCC AGACAAAGTG GAGATCAAGA AAACTACAGG 
540 

GTTGGACTA7 TTGGATTCCC TTATTCAAAA GGGGGAGAAG GATTGCTTCG ACTTTGCATT 
600 

CGTGGACGCA GACAAAGTGA ACTACGTGAA CTATCATCCA CGGCTGATGA AGTTAGTGCG 
660 

CGTGGGGGGC GTCATAATTT ACGACGACAC CCTCTGGTTT GGTCTGGTGG GAGGAAAGGA 
720 

TCCCCACAAC CTGCTTAAGA ATGATTACAT GAGGACTTCT CTGGAGGGTA TCAAGGCCAT
780 

CAACTCCA7G GTAGCCAACG ACCCCAACTT GGAGGTCGCC ACAGTCTTTA TGGGATATGG 
840 

TGTCACTGT7 TGTTACCGCA CTGCTTAGTT AGCTAGTCCT CCGTCATTCT GCTATGTATG 
900 

TATATGATAA TGGCGTCGAT TTCTGATATA GGTGGTTTTT CAATGTTTCT ATCGTCATGT 
960 

TTTCTGTTTA GCCAGAATGT TTCGATCGTC ATGGTTTCTG TTAAAGCCAG AATAAAATTA 
1020 

GCCGCTTGCA GTTCAAAAAA AAAAAAAAAA AAAAACTCGA GACTAGTTCT CTTC 
1074 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1075 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

TCGGAGCTC7 CGAATCCTCA CAGGCCCTTT TTATTTCCCT GGTGAACGAT ACGATGGGCT 
60 






CACAAGGCGT TGCCAGAGAA GGGGAAGG7G ATTGCGGTGG ACACCATTCT CCCAGTGGC7 
540 

GCAGAGACAT CTCCTTATGC TCGTCAGGGA TTTCATACAG ATTTACTGAT GTTGGCATAC 
600 

AACCCAGGGG GCAAGGAACG CACAGAGCAA GAATTTCAAG ATTTAGCTAA GGAGACGGGA 
660 

TTTGCAGGTG GTGTTGAACC TGTATGTTGT GTCAATGGAA TGTGGGTAAT GGAATTCCTG 
720 

CAGCCCGGGG GATCCACTAG TTCT 
744 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 426 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

GTGGCCCTGG AAGTAG7GTG CGCGACATGG ATTCCTTGAA TTTGAACGAG TTTATGTTGT 
60 

GGTTTCTCTC TTGGCTTGCT CTCTACATTG GATTTCGTTA TGTTTTGAGA TCGAACTTGA 
120 

AGCTCAAGAA GAGGCGCCTC CCGCCGGGCC CATCGGGATG GCCAGTGGTG GGAAGTCTGC 
180 

CATTGCTGGG AGCGATGCCT CACGTTACTC TCTACAACAT GTATAAGAAA TATGGCCCCG 
240 

TTGTCTATCT CAAACTGGGG ACGTCCGACA TGGTTGTGGC CTCCACGCCC GCTGCAGCTA 
300 

AGGCGTTTCT GAAGACTTTG GATATAAACT TCTCCAACCG GCCGGGAAAT GCAGGAGCCA 
360 

CGTACATCGC CTACGATTCT CAGGACATGG TGTGGGCAGC GTATGGAGGA CGGTGGAAGA 
420 

TGGAGC 
426

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 562 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

CAGTTCGAAA TTAACCTCAC TAAAGGGAAC AAAAGCTGGA GTTCGCGCGC CTGCAGGTCG 
60 

ACACTAGTGG ATCCAAAGAA TTCGGCACGA GCTTTGAGGC AACCTACATT CATTGAATCC 
120 

CAGGATTTCT TCTTGTCCAA ACAGGTTTAA GGAAATGGCA GGCACAAGTG TTGCTGCAGC 
180 

AGAGGTGAAG GCTCAGACAA CCCAAGCAGA GGAGCCGGTT AAGGTTGTCC GCCATCAAGA 
240 

AGTGGGACAC AAAAGTCTTT TGCAGAGCGA TGCCCTCTAT CAGTATATAT TGGAAACGAG 
300 

CGTGTACCCT CGTGAGCCCG AGCCAATGAA GGAGCTCCGC GAAGTGACTG CCAAGCATCC 
360 

CTGGAACCTC ATGACTACTT CTGCCGATGA GGGTCAATTT CTGGGCCTCC TGCTGAAGCT 
420 

CATTAACGCC AAGAACACCA TGGAGATTGG GGTGTACACT 3GTTACTCGC TTCTCAGCAC 
480

AAGCCGACGA AACCCAATGC CCGGCCGTGA CAATCCACCC GGACGATGTC G7GGCGTTGC
660 

CCTATTCTTC CGGAACCACG GGGCTCCCCA AGGGCGTGAT GTTAACGCAC AAAGGCCTGG 
720 

TGTCCAGCGT TGCCCAGCAG GTCGATGGTG AAAATCCCAA TCTGTATTTC CATTCCGATG 
780 

ACGTGATACT CTGTGTCTTG CC7CTTT7CC ACATC7A77C TCTCAATTCG GT^r-r^CT'— 
840 

GCGCGCTCAG AGCCGGGGC7 GCGACCCTGA TTATGCAGAA ATTCAACCTC ACGACCTGTC 
900 

TGGAGCTGAT TCAGAAATAC AAGG77ACCG T7GCCCCAA7 TGTGCCTCCA ATTGTCCTGG 
960 

ACATCACAAA GAGCCCCATC G777CCCAG7 ACGA7G7CTC GGCCGTCCGG ATAA~<~ A ^GT 
1020 

CCGGCGCTGC GCC7CTCGGG AAGGAACTCG AAGATGCCCT CAGAGAGCGT TTTCCCAAGG 
1080 

CCATTTTCGG GCAGGGCTAC GGCATGACAG AAGCAGGCCC GG7GCTGGCA ATGAA^CTAG 
1140 

CCTTCGCAAA GAATCGTTTC CCCG7CAAAT CTGGC7CCTG CGGAACAGTC GTCCGGAACG 
1200 

CTCAAATAAA GATCCTCGAT ACAGAAACTG GCGAGTCTCT CCCGCACAAT CAAGCCGGCG 
1260 

AAATCTGCAT CCGCGGACCC GAAATAATGA AAGGATATAT TAACGACCCG GAAT r CACGG 
1320 

CCGCTACAAT CGATGAAGAA GGCTGGCTCC ACACAGGCGA CGTCGGGTAC ATTGACGATG 
1380 

ACGAAGAAAT CTTCATAGTC GACAGAGTAA AGGAGATTAT CAAATATAAG GGCTTCCAGG 
1440 

TGGCTCCTGC TGAGCTGGAA GCTTTACTTG TTGCTCATCC G7CAA7CCCT GACGCAGCAG 
1500 

TCGTTCCTCA AAAGCACGAG GAGGCGGGCG AGG7TCCGGT GGCGTTCGTG G7GAAG' r CG T 
1560 

CGGAAATCAG CGAGCAGGAA ATCAAGGAAT TCGTGGCAAA GGAGGTGATT TTCTACAAGA 
1620 

AAATACACAG AGTTTACTTT GTGGATGCGA TTCCTAACTC GCCG7CCGGC AAGATTCTGA 
1680 

GAAAGGATTT GAGAAGCAGA CTGGCAGGAA AATGAAAA7G AA7TTCCATA TGA7TCTAAG 
1740 

A7TCCTTTGC CGATAATTAT AGGATTCCTT TCTG77CAC7 TZ7A777ATA 7AATAAAG7G 
1800 

G7GCAGAG7A AGCGCGG7AT .-AG GAG AG AG AGAGCT7A7C AATTG7ATCA 7ATGGA77GT 
1860 

CAACGCCCTA CAC7C7TGCG ATCGC77TGA ATA7GCATA7 7AC7A7AAAC GA7A7A7G77 
1920 

77TT77ATAA AT77ACTGGA C77C7CG77C AAAAAAAAAA A 
1961 



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1010 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

GACAAACT7G GTCGTTTG7T TAGG7T77GC TGCAGGTGAA CACTAATATG GAAGGCCAGA 
60 

T7GCAGCA7T AAGCAAAGAA GATGAGTTCA 7TTTTCACAG CCCTTTTCCT GCAGTACC T G 
120 

T7CCAGAGAA TATAAGTCTT TTCCAG777G TTCTGGAAGG TGCTGAGAAA. TACCG T GATA 
180 

AGGTGGCCCT CGTGGAGGCG TCCACAGGGA AGGAGTACAA CTATGGTCAG GTGA" , T*T'CG r 
240

CGCACGCTGA GAATGGCAAC GGGGTGGAGG 7TGTTGATCC AACGGACTTA ACTGACATCG
120 

AAGAATGGGA AACCAGGTTA TGACAAGCGT CGCTGCCTGC GGACTGGAAG TTTGGAGTGA 
180 

AGCTTCAAAA CGTTATGGAA GAATCCA77T ACAAGTACAT GCTGGAAACA TTCACCCGCC 
240 

ATCGAGAGGA CGAGGCGTCC AAGGAGCTCT GGGAACGAAC ATGGAACCTG ACACAGAGAG 
300 

GGGAGATGAT GACATTGCCA GATCAGG7GC AGTTCCTGCG CTTGATGGTA AAGATGTCAG 
360 

GTGCTAAAAA GGCATTGGAG ATCGGAG77T TCACTGGCTA TTCATTGCTC AATATCGCTC 
420 

TCGCTCTTCC TTCTGATGGC AAGGTGGTAG CTGTGGATCC AGGAGATGAC CCCAAATTTG 
480 

GCTGGCCC7G CTTCG7TAAG GCTGGAGTTG CAGACAAAGT GGAGATCAAG AAAACTACAG 
540 

GGTTGGACTA TTTGGATTCC CTTATTCAAA AGGGGGAGAA GGATTGCTTC GACTTTGCAT 
600 

TCGTGGACGC AGACAAAGTG AACTACGTGA ACTATCATCC ACGGCTGATG AAGTTAGTGC 
660 

GCGTGGGGGG CGTCATAATT TACGACGACA CCCTC7GGTT TGGTCTGG7G GGAGGAAAGG 
720 

ATCCCCACAA CCTGCTTAAG AATGATTACA TGAGGACTTC TCTGGAGGGT ATCAAGGCCA 
780 

TCAACTGCAT GG7AGCCAAC GACCCCAACT TGGAGGTCGC CACAGTCTT7 ATGGGATA7G 
840 

GTGTCAC7GT TTGTTACCGC ACTGCTTAGT TAGCTAGTCC TCCGTCATTC TGCTA7GTA7 
900 

GTATATGATA ATGGCG7CGA TTTCTGA7AT AGGTGGTTT7 7CAATG7TTC TATCGTCATG 
960 

7TTTC7GT77 AGCCAGAA7G 7TTCGA7CGT CATGGT77C7 G77AAAGCCA GAATAAAATT 
1020 

AGCCGCTTGC AG7TCAAAAA AAAAAAAAAA AAAAAAC7CG AGACTAGTTC 7CTTC 
1075

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1961 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

GTTTTCCGCC ATTTTTCGCC TGTTTCTGCG GAGAAT7TGA TCAGGTTCGG ATTGGGATTG 
60 

AATCAATTGA AAGGTTT7TA TTTTCAGTAT T7CGATCGCC ATGGCCAACG GAATCAAGAA 
120 

GGTCGAGCAT CTGTACAGAT CGAAGCT7CC CGATATCGAG ATCTCCGACC ATCTGCCTC7 
180 

TCATTCGTAT TGCTTTGAGA GAGTAGCGGA ATTCGCAGAC AGACCCTGTC TGATCGATGG 
240 

GGCGACAGAC AGAACTTATT GCTTTTCAGA GGTGGAACTG ATTTCTCGCA AGGTCGCTGC 
300 

CGGTCTGGCG AAGCTCGGGT 7GCAGCAGGG GCAGGTTGTC ATGCTTCTCC TTCCGAA7TG 
360 

CATCGAATTT GCGTTTGTGT TCATGGGGGC CTCTGTCCGG GGCGCCATTG TGACCACGGC 
420 

CAATCCTT7C TACAAGCCGG GCGAGATCGC CAAACAGGCC AAGGCCGCGG GCGCGCGCGA 
480 

TCATAGTTAC CCTGGCAGCT TATGTGGAGA AAC7GGCCGA TCTGCAGAGC CACGATGTGC 
540 

TCGTCATCAC AATCGATGAT GCTCCCAAGG AAGGTTGCCA ACATATTTCC GTTCTGACCG 
600

TCACAAGGAA TGTTGCAGCT GGGCTCGTGG ACAAAGGCAT TCAAAAGGGG GATGTTGTAT
300 

TTGTTCTGCT TCCAAATATG GCAGAATACC CCATTATTGT GCTGGGAATA ATGTTGGCCG 
360 

GCGCAGTGTT TTCTGGGGCA AATCCTTCTG CACAGATCAA TGAAGTTGAA AAACATATCC 
420 

AGGATTCTGG AGCAAAGATT GTTGTGACAG TTGGGTCTGC TTATGAGAAG GTGAGGCAAG 
480 

TGAAACTGCC TGTTATTATT GCAGATAACG AGCATGTCAT GAACACAA7T CCATTGCAGG 
540 

AAATTTTTGA GAGAAACTAT GAGGCCGCAG GGCCTTTTGT ACAAATTTGT CAGGATGATC 
600 

TGTGTGCACT CCCTTATTCC TCTGGCACCA CAGGGGCCTC TAAAGGTGTC ATGCTCACTC 
660 

ACAGAAATGT GATTGCAAAT CTGTGCTCTA GCTTGTTTGA TGTCCATGAA TCTCTTGTAG 
720 

GAAATTTCAC CACGTTGGGG CTGATGCCAT TCTTTCACAT ATATGGCATC ACGGGCATCT 
780 

GTTGCGCCAC TCTTCGCAAC GGAGGCAAGG TCGTGGTGAT GTCCAGATTC GATCTGCGAC 
840 

ACTTTATCAG TTCTTTGATT ACTTATGAGG TCAACTTCGC GCCTATTGTC CCGCCTATAA 
900 

TGCTCTCCCT CCGGTTTAAA AATCCTATCG TTAACGAGTT CGATCTCAGC CGCTTGAAAC 
960 

TCCAAAGC7G TTCATGACTG CGGCTGCTCC ACTGGCGCCG GATCTACTGC 
1010

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 741 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

GAATTCGGCA CGAGACCAT7 TCCAGCTAAT ATTGGCATAG CAATTGGTCA T7CTA7CTTT
60 

GTCAAAGGAG ATCAAACAAA TTTTGAAA77 GGACCTAATG GTGTGGAGGC TAGTCAGCTA 
120 

TACCCAGATG TGAAATATAC CACTGTCGAT GAGTACCTCA GCAAATTTGT GTGAAGTATG 
180 

CGAGATTCTC TTCCACATGC TTCAGAGATA CATAACAGTT TCAATCAATG TTTGTCCTAG 
240 

GCATTTGCCA AATTGTGGGT TATAATCCTT CGTAGGTGTT TGGCAGAACA GAACCTCCTG 
300 

TTTAGTATAG TATGACGAGC TAGGCACTGC AGATCCTTCA CACTTTTCTC TTCCATAAGA 
360 

AACAAATACT CACCTGTGGT TTGTTTTCTT TCTTTCTGGA ACTTTGGTAT GGCAATAATG 
420 

TCTTTGGAAA CCGCTTAGTG TGGAATGCTA AGTACTAGTG TCCAGAGTTC TAAGGGAGTT 
480 

CCAAAATCAT GGCTGATGTG AACTGGTTGT TCCAGAGGGT GTTTACAACC AACAGTTGTT 
540 

CAGTGAATAA TTTTGTTAGA GTGTTTAGAT CCATCTTTAC AAGGCTATTG AGTAAGGTTG 
600 

GTGTTAGTGA ACGGAATGAT GTCAAATCTT GATGGGCTGA CTGACTCTCT TGTGATGTCA 
660 

AATCTTGATG GATTGTGTCT TTTTCAATGG TAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
720 

AAAAAAAAAA AAAAAAAAAA A 
741

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 643 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

CTCATCTCGG AGTTGCAGGC TGCAGCTTTT GGCCCAAAGC ATGATATCAG ATCAAACGAC 
60 

GCAGATGAAG CAAACGGATC AAACAGTTTG CGTTACTGGA GCAGCGGGTT TCATTGCCTC 
120 

ATGGCTTGTC AAGATGCTCC TCATCAGAGG TTACACTGTC AGAGCAGCAG TTCGGACCAA 
180 

CCCAGCTGAT GATAGGTGGA AGTATGAGCA TCTGCGAGAG TTGGAAGGAG CAAAAGAGAG 
240 

GCTTGAGCT7 GTGAAAGCTG ATATTCTCCA TTACCAGAGC TTACTCACAG TCATCAGAGG 
300 

TTGCCACGGT. GTCTTTCACA TGGCTTCAGT TCTCAATGAT GACCCTGAGC AAGTGATAGA 
360 

ACCAGCAGTC GAAGGGACGA GGAATGTGAT GGAGGCCTGC GCAGAAACTG GGGTGAAGCG 
420 

CGTTGTTTTT ACTTCTTCCA TCGGCGCAGT TTACATGAAT CCTCATAGAG ACCCGCTCGC 
480 

GATTGTCCAT GATGACTGCT GGAGCGATTT GACTACTGCG TACAAACCAA GAATTGGTAT 
540 

TGCTATGCAA AAACCTTGGC AGAGAAATCT GCATGGGATA TTGCTAAGGG AAGGAATTTA 
600 

GAGCTTGCAG TGATAAATCC AGGCCTGGCC TTAGGTCCCT TGA 
643 

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 441 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

GAATTCGGCA CGAGAATTTT TCTGTGGTAA GCATATCTAT GGCTCAAACC AGAGAGAAGG 
60 

ACGATGTCAG CATAACAAAC TCCAAAGGAT TGGTATGCGT GACAGGAGCG GCTGGTTACT 
120 

TGGCATCTTG GCTTATCAAG CGTCTCCTCC AGTGTGGTTA CCAAGTGAGA GGAACTGTGC 
180 

GGGATCCTGG CAATGAGAAA AAGATGGCTC ATTTATGGAA GTTAGATGGG GCGAAAGAGA 
240 

GACTGCAACT AATGAAAGCT GATTTAATGG ACGAGGGCAG CTTCGATGAG GTCATCAGAG 
300 

GCTGCCATGG TGTTTTTCAC ACAGCGTCTC CAGTCGTGGG TGTCAAATCA GATCCCAAGA 
360 

TATGGTATGC TCTGGCCAAG ACTTTAGCAG AAAAAGCAGC ATGGGATTTT GCCCAAGAAA 
420 

ACCATCTGGA CATGGTTGCA G 
441 

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 913 base pairs
(B) TYPE: nucleic acid




(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

GAATTCGGCA CGAGGAAAAC A7CATCCAGG CATTT7GGAA ATTTAGCTCG CCGGT7GAT7 
60 

CAGGATCC7G CAATGGCTTT TGGCGAAGAG CAGACTGCCT TGCCACAAGA AACGCC777G 
120 

AATCCTGCGG TCCATCGAGG AACAGTG7GC GTTACAGGAG CTGCTGGGTT CATAGGGTCA
180

TGGCTCATGA TGCGATTGCT TGAGCGAGGA TATAGTG7TA GAGCAACTGT GCGAGACAC7
240 

GGTAATCC7 3 TAAAGACAAA GCATCTGTTG GATCTGCCGG GGGCAAATGA GAGATTGACT 
300 

C7CTGGAAA3 CAGATTTGGA TCATGAAGGA AGCTT7GA7G CTGCCATTGA TGGGTGTGAG 
360 

GGTGTT77C7 ATGTTGCCAC TCCCATGGAT TTCGAGTCCG AGGATCCCGA GAATGAGATA
420 

ATTAAGCCAA CAATCAACGG GGTCTTGAAT GTTATGAGAT C3TGTGCAAA AGCCAAGTCC 
480

GTGAAGCGAG TTGTTTTCAC GTCATCTGCT GGGACTGTGA ATTTTACAGA 7GA.TTTC7AA. 
540 

ACACCAGG_-. AAGTT7TTGA CGAATCATGC TGGACC AACG TGGATCT7TG C AG AAA A. C T T 
600 

AAAATGACAG GATGGATGTA CTTTGTATCG AAGACATTAG CAGAGAAAGC TGCTTGGGAT 
660 

TTTGCAGAGG AGAACAAGAT CGATCTCATT ACTGTTATCC CCACATTGGT CGTTGGACCA 
720

TTCATTA7GC AGACCATGCC ACCGAGCATG ATCACAGCCT TGGCACTGTT AACGCGGAA.T 
780

GAACCCCACT ACATGATACT GAGACAGGTA CAGCTGGTTC ACTTGGATGA TCTCTGTATG 
840 

TCACATATCT TTGTATATGA ACATCCTGAA GCAAAGGGCA GATACATCTC TTCCACATGT 
900

GATGCTACCC ATT 
913 

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 680 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

GAATTCGGCA CGAGATCAAT TTTTGCATAT TATTAAAAAG TAAGTGTATT CGTTCTCTAT 
60 

ATTGATCAGT CACAGAGTCA TGGCCAGTTG TGGTTCCGAG AAAGTAAGAG GGTTGAATGG 
120 

AGATGAAGCA TGCGAAGAGA ACAAGAGAGT GGTTTGTGTA ACTGGGGCAA ATGGGTACAT 
180 

CGGCTCTTG.G CTGGTCATGA GATTACTGGA ACATGGCTAT TATGTTCATG GAAC7G77AG 
240 

GGACCCAGAA GACACAGGGA AGGTTGGGCA TTTGCTGCGG CTCCCAGGGG CAAGTGAGAA. 
300 

GCTAAAGCTG TTCAAGGCAG AGCTTAACGA CGAAATGGCC TTTGATGATG CTGTGAGCGG 
360 

TTGTCAAGGG GTTTTCCACG TTGCCAAGCC TGTTAATCTG GACTCAAACG CTCTTCAGGG 
420 

GGAGGTTG77 GGTCCTGCGG TGAGGGGAA.C AGTAAA.7CTG CTTCGAGCCT GCGAA.CGAT^ 
480

GGGCACTCTG AAACGAGTGA TACATACCTC GTCCGTT7CA GCAGTGAGAT 7CACTGGGAA
540 

ACCTGACCCC CCTGATACTG TGCTGGATGA ATC7CA7TGG ACTTGGGTCG AGTATTGCAG 
600 

^AAAGACAAAG ATGGTCGGAT GGATGTACTA CATCGCCAAC ACTTA7GCAG AAGAGGGAGC 
660

CCATAAGTTC GGATCAGAGA 
680 

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 492 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

GAATTCGGCA CGAGGCTGGT TCAAGTGTCA GCCCAATGGC . 1 ^ w I .-. r\ ■-AGAATCCC2
60

AGATTTCAGA AGAGCTGCTA AATCATGAGA TCCATCAAGG AAGTACAG7A TGtGTGACAG
120

GAGCTGCTGG CTTCATAGGA TCATGGCTCG TCATGCGTTT GCTTGAGCGA GGATATACTG
180

TTAGAGGAAC TGTGCGAGAC ACTGGTAATC CGGTGAAGAC GAAGCATCTA TTGGATCTGC
240

CTGGGGCGAA TGAGAGGTTA ACTCTCTGGA AAGCAGATTT GGATGATGAA GGAAGCTTTG
300

ACGCCGCCAT TGATGGTTGT GAGGGAGTTT TCCATGTTGC CACTCCCATG GATTTTGAAT
360

CCGAGGACCC CGAGAACGAG ATAATTAAAC CCGCTGTCAA ToGGATGTTG »"~LT\ 1 V3 t 1 t 1 Ijn
420

GATCGTGTGG GAAAACCAAG TCTATGAAGC GAGTTGTTTT CACGTCGTCT GCTGGGACTC
480

TGCTTTTTAC GG
492


(2) INFORMATION FOR SEQ ID NO: 64:







(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 524 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

GAATTCGGCA CGAGCTTGTT CAAAGTCACA TATCTTATTT TCTTTGTGAT ATCTGCAATT 
60 

TCCAAGCTTT TCGTCTACCT CCCTGAAAAG ATGAGCGAGG TATGCGTGAC AGGAGGCACA 
120 

GGCTTCATAG CTCCTTATCT CATTCGTAGT CTTCTCCAGA AAGGTTACAG AGTTCGCACT 
180

ACAGTTCGCA ACCCAGATAA TGTGGAGAAG TTTAGTTATC TGTGGGATCT GCCTGGTGCA 
240 

AACGAAAGAC TCAACATCGT GAGAGCAGAT TTGCTAGAGG AA.GGCAGTTT TGATGCAGCA 
300 

GTAGATGGTG TAGATGGAGT ATTCCATACT GCATCACCTG TCTTAGTCCC ATA7AACGK
360 

CGCTTGAAGG AAACCCTAAT AGATCCTTGT GTGAAGGGCA C7ATCAATGT CCTCAGGTCC 
420

TCTTCAAGAT CACCTTCAGT AAAGCGGGTG GTGCTTACAT CCTCCTGCTC ATCAATAC^G 
480

ATACGACTAT AATAGCTTAG ACCGTTCCCT uC i G v_. T ^- A : jTC A
524

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 417 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

TCCTAATTGT TCGATCCTCC CTTTTAAAGC CCTTCCCTGG CCTTCATTCC .-.vjGTCACA G A
60

GTTGTTCATG CAGTGCTAGC A.oGAGGAGCA GCGTTGCAA7 TGGGGAAAA.T TCCAAAA.T ~A
120

ATAACGAGA.G GACAGAAGTA AGTTTGTGGA AATAGCA-ACC .~ T-j %j G T G T
180

TCTGGACCGC TCTGAGGACA ATGGCAAGCT CGTTTGTGTC ATGGA.TGCGT CCAGTTAT GT
240

AGGTTTGTGG ATTGTTCAG'o GCCTTCTTCA Av^oAGGCTAT TCAGi^CATG CCACGGTG
300

ijAGAGACGGT ■oGCGAGGTTG AGTCTCTCAG AAAATTGC.n7 .GCAGAT7
360

CTATGCAGA7 GTCTTGGATT ATCACAGCAT TACTGATGCG CTCAAGGGC? — -r» rr* t> ,~
417

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 511 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:



60 

CTTCATTCCA TCATCCAGGA GCTTCTGT 
120 

TGAAGAAAAT GGATACGGCG CTTCCAAT 
180

GAGTTTCCTG GGGATTCATA TCGCAAGA 
240 

CGCAATTCCG GTAACGCCAG AAGAGGCA 
300 

GGGGAAGCTG GAGATATGCC AAGCCGAT 
360 

CAATGGTTGC TCCGGAGTCT TCCACGTO 
420 

GGAGTATCCG GTATGATTAG TTTAATAG- 
480 

GAATTTAAG3 TTTTCTTAGA ATTTGGATAC T 
511 

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 609 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear






SEQ I D NO: 


66 : 




GAGCTTGAAG 


CTC7G7CTTC 


TCTGATA7CG 


T AT C AT ^ ^ f 


^TCru-AATG'j 


ATGCCTACCT 


TCGGAAATTA 


ATGTGCCTTA 


CCGGGGGCTG 


GCTGCTCGGC 


CGGGGTTACT 


CAGTCCGTTT 


CTCACTTATG 


GAATCCGAAG 


AAGCATTATC 


CTTGGATTAT 


CGCAGCGTTT 


TCGGCAACAT 


TGCGCCCTGT 


GATCATCTGG 


ATGGATTACA 


TGACGGGGTA 


TCC7GTATGA 


ATTAGTTTAT

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

CA77GATAG7 TGATGGAAGA CCATCAGTAA AGCATGAAAA AGAAA7TGTT CCAAGGTGAA 
60 

GAAGTCAGT7 GCTCCAGCAG AACCTTTTTA GCAATTG7TT TTGTATCCT7 . 77GCC7TTG 
120 

AATATGTAAT CCATAAACTT ATGCAGGAAG TGCCTCGTGC CGAATTCGGC ACGAGAATCA 
180

CTGACCTTCA CATATTTATT CCAATTCTAA TATC7CTACT CGCTGTCTAC C7GA7TTTTC 
240 

AG7GGCGAAC CAACTTGACA GGG7TGGACA TGGCCAACAG CAGCAAGATT 77GATTATTG 
300 

GAGGAACAGG CTACA77GG7 CGTCATATAA CCAAAGCCAG CCTTGCTC77 GGTCATCCCA 
360 

CAT7CC7777 7G7CAGAGAG ACC7CCGC7T CTAATCC7GA GAAGGCTAAG C77C7GGAA7 
420 

CCTTCAAGGG C7CAGGTGG7 A77A7ACTCG A7GGA7C7T7 GGAGGACCA7 GCAAGTCT7G 
480

7GGAGGCAA7 CAAGAAAG77 GA7G7AG7TA 7C7CGGCTG7 CAAGGGACCA CrtGCTGACGG
540

7TCAAACAGG ATA777A7CG AGGG7AT77A AAGGGAGGG7 TGGAACCCAT CAAGAAGGG7
600 

TT7GGCCAA 
609 

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 474 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

GCAAGATAGG 7TTTA7TC77 C7GGAGT7GG GTGAGGCTTG GAAATTTAAG 7AAAAAGGG7 
60 

GCA7AGCAA7 TAAGCAGT7G CAGCCATGGC GGTCTGTGGA AC7GAAC7AG C7CATACTGT 
120 

GC7CTATG7A GC7GCAGACA TGG7GGAAAA CAACACG7C7 A77G7GACCA CCTCTATGoG 
180 

TGCAGCAAA7 TGTGAGATGG AGAAGCC7CT TCTAAATTCC 7CTGCCACCT CAAGAATAC7 
240 

GGTGATGGGA GCCACAGG77 ACATTGGCCG TTTTGTTGCC CAAGAAGCTG 7TGCTGCTGG 
300 

TCATCCTACG TATGCTCTTA TACGCCCGTT TGCTGCTTGT GACC7GGCCA AAGCACAGcL: 
360 

CGTCCAACAA TTGAAGGATG CCGGGGTCCA TATCCTTTAT GGGTCTTTGA GTGATCACAA 
420 

CCTCTTAGTA AATACATTGA AGGACATGGG CCG7TGTTAT CTCTACCATT GGAG 
474 

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 474 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

GCAAGATAGG TTTTATTCTT CTGGAGTTGG GTGAGGCTTG GAAATTTAAG TAAAAAGGGT
60 

GCATAGCAAT TAAGGAGTTG CAGCGATGGC GGTCTGTGGA ACTGAAGTAG CTCATACTGT
120

GCTCTATGTA GCTGCAGACA TGGTGGAAAA CAACACGTCT ATTGTGACCA CCTCTATGGC
180

TGCAGCAAAT TGTGAGATGG AGAAGCSTTT TCTAAATTCC TCTGCCACCT CAAGAATACT
240

GGTGATGGGA GCCACAGGTT ACATTGGCSG TTTTGTTGCC CAAGAAGCTG TTGCTGCTGG 
300 

TCATCCTACC TATGCTCTTA TACGCCCGTT TGCTGCTTGT GACGTGGCCA AAGCACAGGG 
360

CGTCCAACAA TTGAAGGATG CCGGGGTCTA TATCCTTTAT GGGTCTTTGA GTGATCACAA 
420 

CCTGTTAGTA A-ATACATTGA AGGACATGGG CGGTTGTTAT CTCTACCATT GGAG 
474 

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 609 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70: 

CATTGATAGT TGATGGAAGA CCATCAGTAA AGCATGAAAA AGAAATTGTT CCAAGGTGAA 
60 

GAAGTCAGTT GCTCCAGCAG AACGTTTTTA GCAATTGTTT TTGTATCCTT TTTGCCTTTG 
120 

AATATGTAAT CCATAAACTT ATGCAGGAAG TGCCTCGTGC CGAATTCGGC ACGAGAATCA 
180 

CTGACCTTCA AATATTTATT CCAATTCTAA TATCTCTACT CGCTGTCTAC CTGATTTTTC 
240 

AGTGGCGAAC CAACTTGACA GGGTTGGACA TGGCCAACAG CAGCAAGATT CTGATTATTG 
300 

GAGGAACAGG CTACATTGGT CGTCATATAA CCAAAGCCAG CCTTGCTCTT GGTCATCCCA 
360 

CATTCCTTCT TGTCAGAGAG ACCTCCGCTT CTAATCCTGA GAAGGCTAAG CTTCTGGAAT 
420

CCTTCAAGGC CTCAGGTGCT ATTATACTGC ATGGATCTTT GGAGGACCAT GCAAGTCTTG
480 

TGGAGGCAAT CAAGAAAGTT GATGTAGTTA TCTCGGCTGT CAAGGGACCA CAGCTGACGG 
540 

ATCAAACAGG ATATTTATCC AGGGTATTTA AAGGGAGGTT GGAA.CCCATC AAGAAGGGTT 
600 

TTGGCCAA 
608 

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1474 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

GAATTCGGCA CGAGAAAACG TCCATAGGTT CCTTGGCAAC TGGAAGCAAT ACAGTACAAG 
60 

AGCCAGACGA TCGAATCCTG TGAAGTGGTT CTGAAGTGAT GGGAAGCTTG GAA.TCTGAAA 
120

AAACTGTTAC AGGATATCCA
180

ATC7CAGAAA GAAAGGACCT 
240 

ACTCTGATT7 AGTTCAAATG 
300 

GGCATGAAGT GGTGGGGATT 
360 

(jAGAGCATGT AGGGGTTGGT 
420 

AGAGCATGGA ACAATACTGC 
480

GCACACCTAC TCAGGGCGGA 

GAATCCCGGA GAATCTTCCT 
600 

TT7TCAGC7C AATGAAGCAT 
660 

GTTTAGGAGG CGTGGGGCAC 
720 

CGGTTATCAG TTCG7CTGAT 
780

C7TATCTTG7 TAGGAAGGAT 
840 

TAATGGACAG CATTCCAGTT 
900 

ATGGAAAGC7 AGTGATGC7G 
960 

TAATACTTGG GAGAAGGAGC 
1020 

AAACTCTAGA TTTCTGTGCA 
1080 

ACTACATCAA CACGGCCATG 
1140 

TGGATGTTGC TAGAAGCAAG 
1200

TGCATGCAAG ATGAATAGAT 
1260 

ATTTAGGAA.G TCGATACTGG 

1320

TTCAGATG77 TTTTTAACTT 
1380 

7CGAATGTG7 TCTGGCAAAT 

1440

AAAAAAAAAA AAAAAAAAAA 
1474 

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1036 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

GAATTCGGCA CGAGAGAGGG TTATATATC7 TGATTCTGAC '7TGA77GTCG TCGACGAGAT 
60 

TGCCAAGCT7 TGGGCCACGG A7TTGGAA77 TCG7GTCCTC GGGGCACCAG AGTA^TG^AP 
120 

GGCGAATT77 ACAAAGTATT TCACCGATAA TTTCTGGTGG GATCCCGCAT TATCCAAGA^ 
180 

CTTTGAGGGA AAAAAACCCT GCTACTTCAA CACAGGCGTA ATGGTGATCG ATCTTGAAAA 



'oCTCGGCjACT 


^— CAGTGG 1 — 7 A 




7ACACTTAC A 


GAGGATGTAA 


TTGTAAAGGT 


C ATT TACT GC 


GGAATCTGCC 


CGTAATGAAA 


TGGACA7G7C 


7CATTACCCA 


ATGGTCCCTG 


GTAACAGAGA 


T7GGCAGCGA 


Go i. GAAG AAA. 


77CAAAGTGG 


TGCATTGT7G 


GGTCCTGTCG 


CAGTTGCGGT 


AATTGCAAT*" 


AGCAAGAGGA 


TTTGGACCTA 


CAATGATGTG 


A uiTATmr"^ 


TTTGCAAGCA 


GTATGGTGGT 


7GATCAGATG 




7TGGAACAAG 


CGGCCCCTCT 


jTTATGTGC A. 


vj'ju'j i »rt^/^o 


TTCGCCATGA 


CAGAGCCGGG 


.j /-_AG A/^-A T G 7 


G G ^G A T ^ t T C r; 


ATGGG7G7CA 


AGATTGCCAA 






AAAAAGAAAG 


AAGAAGCCAT 




o o *_ o ^ on 1 o 


ACTG.-Ji.AAGA 




*7*. Cj <ta vj r\ % ^ r*. 




GCTCATCC77 


TGGAACCA7A 


•r* ^ *t> ^ ^ 




tjvoCGTTGTTC 


CAGAGCCCTT 


GCAC77CGTG 


AC7CCTC7 r "" 


ATAGCTGGAA 


GTTTCAT7GG 


CAGCATGGAG 




GAGAAGAAGG 


TATCATCGAT 


i ^ T* T* r*t t*"* f"* T* T* 




GAAAGGTTGG 


AGAAGAACGA 


TGTCCGTT AC 




TTGGATAATT 


AGTCTGCAAT 


CAATCAATCA 


G AT C A A T cc<~ 


CTGGACTAGT 


AGCTTAACAT 


■j j"^""*/^ G ■ j v_3 r\^\SK 


TTAAATTT"^ 


TTTTTGTTAC 


TTTAGTT7AG 


- ► i i i j i urtLa 


GTTGAAACAA 


GTATATGTAA 


AGATCAA7TT 


\, » C'sj * oAlAG 


TAAATAATAA 


TAATATA7GT 


ATTCGTATT7 


TTATATGAAA 


AAAAAAAAAA 


AAAAAAAAAA 


AAAA 
ATGGCGGGCA GGGGAAT7CA CAAGAAAGA7 CGAAATCTCG
300

CCGTATCTAT GAGCTCGGAT CAT7ACCGCC AT777TACTG 
360 

GCAAGTCGAT CATCGTTGGA A7CAGCACGC TTTAGGCGGA 
420

CCGAGATCTT CACCCTGGAC 77GTCAG77T GTTGCATTGG 
480 

GCTACGCCTG GAATGCCAAG CGGAC77GCC C7CTGGA T AC 
540 

TTTATCGA7C AACGTATTAC CTAAATGGGT GAGAGAGCC7 
600 

ATCGAATTAA ACCTGATTTG ATAAAATGCC AAATAGAAC7 
660 

GTTTTGAATT TCAATTCTGG TAACGAATAG AAG AAAAC AA 
720 

CAAATCCATC ATGAGGGACC AATCG7TTGA AT77AG7AT7 
780 

CGGCTGTGAA. GAATGATAT7 G7GGAC7GAT pt-^TA^AT 

CAGCCAGGAG AGAGGCAAGC AA7GCCGC7G CAAGTCATGT
900

AATTTTCGGC GACTG7ACAG GA7GTAAATT TTTGGAACA7
960

CCTGAACCAA CAACTGTATA A7ACC77A7A AA^GTATCT- 
1020 

AAAAAAAAAA AAAAAAAA 
1036 

(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 372 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:




CTAGGGGTCT 

60 


TGGGGGGT7C 


CTGATGCCCA 


ATTGTTGCTC 


i '■ J - T - O'-J V„ A. . 


3 AA.C C C AAAA 


CATGCAAG AG 
120 

GTCCTTTTA.G 
180 


ATCTGTACTC 


AG7AGTC77G 


TTGGATCTAT 


AG C77 7 7 AG A 


AAAGAGTCAC 


GGTAACATCA 


TTCCAACCAT 


ATCCAGTTCC 


ACCACCGGCT 


ACACCTTCAA 


CGGGAGGAGG 
240 


AGCAAGATAT 


TCAGCATTGC 


TTTGGGCACC 


AGATGGATAG 


GCATTATTTT 


CCATCGGAAT 
300 


TCAGCCGAGC 


TCGCCCCCTC 


AG7CCAATCG 


TC'oTGriAAAT 


CCCTCAAAAT 


TGGGCAATTC 
360 


TGGCTCGAAA 


TCGCCAAATT 


ATGGGCTACA 


ACAGGA7TAA 


AA T T G C A C A G 


AAATCTGCCA 
372 


GT 










(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 545 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:



A i jvj .-v- AT AC AG AAG G AAC G 
G7A777GC7G 3TTTGGT7AA 
GATAA777GC AAGGCC7TTG 
AGTGG7AAGG GCAAACCTTG 
TTTATGGGCT CCTTATGATC 
CTCTCCTCGG GGTGC77TTT 
TTACGCCTAT GCATC7TTCA 
7AGCACAGCC ACAGGCAGGA 
AATAAGGTTG 77CCATA7AA 
77GTA77GC7 A7GCCATCC7 
A G G G AAG G C G 7 7 G 7 G AAC 7 C 

- -'"-^'777 '-A i 7 7 7 T G C A T AAA. 
AAAGAATTCG GCACGAGGGC
60 

GGGAGTTGGC GAGAGAAGCT 
120 

CCAACAAGCC TTTGC7CCCT 
180 

ATCCTGATAn TCTGGGTTAT 
240 

GAAACATAAC CGTAGGAACT 
300 

CTGAAGTGGT TTATGAGCAA 
360 

ATGGGATTCT GGTTGTGGGT 
420 

TGACCATTCC CCTAGGCGGA 
480 

TTGTAATCTT GATATCTGGA
540 

GTTTT 
545 

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 463 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

GCAGGTCGAC ACTAGTGGAT CCAAAGAATT CGGCACGAGA AAAAACAAAT GTTAGCTAGC
60 

CTAGTGATGA GCTTTACGTA TACCTGGCCT TTTATACATG GATCTGAGTT TTTATGCAGG 
120 

TGTAGAGCCT TTTGTTACTC TGTATCACTG GGACTTGCCA CAAGCTCTGG AGGACGAATA 
180

CGGTGGATTT CGTAGCAAAA AAGTTGTGGA TGACTTTGGC ATATTCTCAG AAGAATGCTT
240

TCGTGCTTTT GGAGACCGTG TGAAGTACTG GGTAACTGTT AACGAACCGT TGATCTTCTC
300 

ATATTTTTCT TACGATGTGG GGCTTCACGC ACCGGGCCGC TGTTCGCCTG GATTTGGAAA 
360 

CTGCACTGCG GGAAATTCAG CGACAGAGCC TTATATTGTA GCCCATAACA TGCTTCTTGC 
420 

ACATAGTACC GCTGTTAAAA ATATATAGCA TAAATACCCA GGG
463 

(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 435 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

ACACTAGTGG ATCCAAAGAA TTCGGCACGA GGCTACCATC 7TCCC7CATA ATATTGGGCT
60

TGGAGCTACC AGGGATCCTG ATCTGGCTAG AAGAATAGGG GCTGCTACGG CTTTGGAAGT
120

TCGAGCTAC7 GGCATTCAAT ACACATTTGC TCCATGTGTT GCTGTTTGCA GAGATCCTCG
180 



AA.7CCGAGCD TAGCCAACCA ACTTGGCAGC AAGGAGCACA 

GTTAGGAAAT CTTTGGTATT G7TGAAAAAT GGGAAG7CAG 

TTGGAGAAGA ATGCTTCCAA GGTTCTTCTT GCAGGAACCC 

CAGTGTGGTS GATGGACGAT GGAATGCCAA 3GATTAAGTG 

ACAATTCTGG AAGCTATCAA ACTAGCTGTC AGCCCCTCTA 

AATCCAGATG CTAACTAT GT CAAAGGACAA 33GTTT7CAT 

GAGGCACCAT ACGCAGAAAC G ITT Go AG AC AATCTTAATT 

GGGGACACGA TTAAGACGGT C7CTGGCTC3 7TGAAATGCC 
AGGCCACTTG TTATTGAACC T7ATCTTCCA TTGGTGGATC 

ION FOR 3E"2 ID NO : 7 :. : 
ATGGGGCCG2 TGCTATGAGA GGTACAGTGA GGATCCA-AAA. ATT 2TCAAGG T7ATGACTG-
240 

GATTATCGTT GGCCTGCAAG GGAATCCTCC TGCTAATTGT ACAAAAGGGG G3CCTTT~-~ 
300 

ACCTGGACAG TCA.AATGTTO GAGCTTGTGG TAAGCATTTT GTGGGTTATG "TCGAACAA/* 
360 

CAAAGGTA7C GATGAGAA.TA ATACTGTTAT CAACTATCA-A GGGTTATTTC -A.CATTCCAA 
420 

ATTACGCCCA ATTTT 
435 

(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 451 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

GAATTCGGCA CGAGCCTAGA ATTCTATGGT GAAAATTGTT GGGACAAGGC TGCCCAAGTT
60

TACAAAGGAA CAGTCCCAAA TGGTTAAAGG TTCAATAGAC TATCTAGGCG TTAACCAATA
120

CACTGCTTAT TACATGTATG ATCCTAAACA ACCTAAACAA AATGTAACAG ATTACCAGAC
180

TGGACTGGAA TACAGGCTTT GCATATGCTC GCAATGGAGT GCCTATTGGA CCAAGGGCGA 
240 

ACTCCAATTG GCTTTACATT GTGCCTTGGG GTCTATACAA GGCCGTCACA TACGTAAAAG 
300 

AACACTATGG AAATCCAACT ATGATTCTCT CTGAAAATGG AATGGACGAC TTGGAAACGT 
360 

GACACTTCCA GCAGGACTGC ATGATACCAT CAGGGGTAAC TACTATAAAA GCTATTT3CA 
420 

AAATTTGATT AATGCACGTG AATGACCGGG G
451 

(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 374 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

CTGCTCTGCA AGCAGTACTA TGCACAGCAA GGCCTGCTTA ACTGAAAACA GAGCGCTGAG 
60 

CTTGAGGAAA CGCTCAAGCA TTGCTGAGGC CACCGTTTAT CTAAATAGCG CAACATAGGG 
120 

CTTCAGAAAA ATGGCAATGG CACAAGCATT CAGAGGCCGT GTCTTGCAAG CTGCCCGTTT 
180 

GCTCCGCCGC AACATTCTCC CGGAGGATAA AAGCTTTGGA TCCGCTGCTT CTCCTAGACG
240 

AGCTCTTAGG CTGCTCTCAT CAAAAGCCTT CATCTCTTTC TCTGTTGAAC GGCATCG"C T 
300 

AGCTGCTACA AA T T C AAC ~J\ TTGTGTTGCA ATCTCGAAAC TTTTCTGCAA AAGGTAAAA- 
360 

GACAGGACAA TCTG 
374 

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 457 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

GAAGAATGGA AGAGATTAAT GGTGATAACG CAGTAAGGAG GAGCTGCTTT CCTCCAGGT7 
60 

TCATGTTTGG GATAGCAACT TCTGCTTATC AGTGTGAAGG AGCTCCCAAC GAAGGTGGAA 
120 

AAGGCCCAAG CATCTGGGAC TCATTTTCAC GAACACCAGG CAAAATTCTT GATGGAAGCA 
180

ACGGTGATGT AGCAGTGGAT CAGTATCATC GTTATAAGGC AGATG7AAAA CTGATGAAAG
240

ATATGGGCGT GGCTACCTAC AGATTCTCGA TT7CA7GGCC TCGTATA7TT CCAAAGGGAA 
300 

AAGGAGAGAT CAATGAGGAA GGAGTAGCOT ATTACAATAA CCTCATCAAT GAACTCCTCC
360

AGAATGGAAT CCAAGCGTCT GTCAACTTTC TTTCACTG3G ATACTCCCCA GTCTCTGGA-J 
420 

GATGAATATG GCGGATTTCT GAGGCCAACC ATTGTC-A
457 

(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 346 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

GGTGTGATGC CAGGAATTCC AGTCCTAAGG CCATTTTGCA TCTGTTTGCT TTCAGTCTAC 
60 

ATGCTGCACA TTGTAGCTCC AGTAGCTTCA CCAAGGCTAG GTAGAAGCAG CTTCCCAAGG
120

GGTTTCAAAT TTGGTGCAGG GTCATCTGCT TATCAGGCGG AAGGAGCTGC TCATGAGGGT
180

GGCAAAGGCC CAAGCATTTG GGATACATTC TCCCACACTC CAGGTAAAAT CGCTGATGGG 
240 

AATATTGGGA TGTTGCAGTA GATCAATACC ACCCTTATAA GGAAGATG7G CAGCTTCTCA 
300 

AATACATG3G AATGGACGTC TATCGTTTCT CTATCTCCTG GTCACG 
346 

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 957 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

GAATTCGGCA CGAGAAAGCC CTAGAA7TTT TTCAGCATGC TATCACAGCC CCAGCGACAA 
60 

CTTTAACTCC AATAACTGTG GAAGCGTACA AAAAGTTTGT CCTAGTTTCT CTCATTCAGA 
CTGG7CAGG7 7CCAGCA777
180 

CTTGCAXTCA GCCCTACAT7 
240 

7GGAAGCT7G TGTCAACACG 
300 

TCAAGCAAG7 TTTGTCATCT 
360 

TGACCCTCTC TCTTCAAGAC
420

AAC7CCATG7 7CTGCAGATG
480

ATGGGATGGT GAGCTTCAAT
540

ATATAGATAC 7GCAA7TCGG
600

AGCAGATTTC GTGTGATCAT
660

ACATAGATGA 7T7TGATAC7
720

TCATCTTCAA GACTCGCTTA
780

7AG7ACTG7G GC7GAGTCGA
840

AAAATCTCAA ATT7C7CGAT 
900 

7GACA777GA GCACC7CGAG 
957 

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 489 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

GCAGG7CGAC ACTAGTGGAT CCAAAGAATT CGGCACGAGA 7AAGACTAA7 77TCCAGACA 
60 

ATCCTCCA7T CCCATTCAAT TACACTGGTA CTCCACCCAA TAATACACAG GCTGTGAATG 
120 

GGACTAGAGT AAAAGTCCTT CCC7TTAACA CAACTGTTCA ATTGATTCTT CAAGACACCA 
180 

GCA7C7TCAG CACAGACAGC CACCCTGTCC ATCTCCATGG TTTCAATTTC T7TGTGGTGG 
240 

GCCAAGGTGT TGGAAAC7AC AATGAATCAA CAGATGCACC AAATTTTAAC CTCATTGACC
300 

CTGTCGAGAG AAACACTGTG GGAGTTCCCA AAGGAGGTTG GGCTGCTATA AGATTTCGT" 
360 

CAGACAATCC AGGGGTTTGG TTCATGCACT GTCATTTGGA GGT7CACACA TCGTGGGGAC 
420

TGAAAATGGC GTGGGTAGTA AAGAACGGAA AAGGGCCCA7 CGATT7TCCA CCCGGGTGGG 
480 

TACCAGTAA 
489 

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 471 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



CCAAAA7ACA CACCTGC7G7 
GATT7AGCAA ACAACTACAG 
AACACAGAGA AGTTCAAGAA 
CTTTATAAAC GGAATAT7CA 
ATAGCAAGTA CGGTACAGT7 
ATTCAAGATG GTGAGATTT7 
GAGGATCC7G AACAGTACAA 
AGAATCA7GG CACTA7CAAA 
7CCTACC7GA GTAAGGTGGG 
GTTCCCCAGA AGTTCACAAA 
TATTCA77AC TTTCTATGTG 
GAAAGGA7C7 C7CGGTAT7A 
GTCTAGTC77 GATTTTGATT 
TGAACTACAA AGTTGCATGT 



TG7CCAAAGA AATT7GAAA7 
TAGTGGGAAA A7TTCTG7AT 
TGATAGTAAT 77GGGG77AG 
GAGATTGACA CAGACATATC 
GGAGACTGCT AAGCAGGCTG 
TGCAACCATA AATCAGAAAG 
AACA7GTCAG ATGACTGAA7 
GAAGCTCACC AC AG T AG AT G 
GAGAGAGCGT 7CAAGATT7G 
TA.TGTAACAA A7GATG7AAA 
AATTGATAGT C7GTTAACAA 
TCACTTGACA 7 GC CATC AAA 
A7GAATGCGA CTT7TAG77G 
7 AAAAAAAAA AAAAAAA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

GAATTCGGCA CGAGAAAACC 7TTTCAGACG AATGTTCTGA TGCTCGGCCC CGGCCAGACA 
60 

ACAGACATAC TTCTCACTGC CAATCAGGCT ACAGGTAGAT ACTACATGGC TGCTCGAGC-
120 

TATTCCAACG GGCAAGGAGT TCCC7TCGAT AACACCACTA CCACTGCCAT TTTAGAATAC 
180 

GAGGGAAGC? CTAAGACTTC AACTCCAG7C ATGCCTAATC TTCCATTCTA 7A^CGACACC 
240 

AACAGTGCTA CTAGCTTCGC 7AATGGTCTT AGAAGCTTGG GCTCACACGA CCACCCAGT^ 
300 

TTCGTTCCTC AGAG7GTGGA GGAGAATC73 TTCTACACCA 7C GG7T7GGG GTTGATCAAA 
360 

TGTCCGGGGC AGTCTTGTGG AGG7CCAACG GATCAAGATT TGCAGCAAGT ATGAA7AC£ T 
420

ATCAT7TGTC CCGCAACCAC 77CT7CCAA7 CC77CAACC7 CAGCA7TTTG G

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 338 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

GTTCGGCACT GAGAGATCCA 77TCTTTCAA TGTTGAGACA GTGAGTAGTA TTAGTTTGAT 
60 

ATCTCTTTCA GGAATATATC GTGCTTGCAG GATCTTTAGT TTCTGCAACA ATGTCG7TGC 
120 

AATCAGTGCG TCTATCTTCT GCTCTCCTTG TTTTGCTACT AGCATTTGTT GCTTACTTAG 
180

TTGCTGTAA3 AAACGCAGAT GTCCACAATT ATACCTTCAT TATTAGAAAG AGACAGTTAC 
240 

CAGGCTATGC AATAAGCGTA TAATCGCCAC CGTCAATGGC AGCTACCAGG CCCAACTATT 
300 

CATGTACG7G ATGGAGACGT TGTTAATTAT CAAAGC7T 
338 

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1229 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

AGAGAAATAA 7TATAT7TGT AAATTTAAGT CTACGTTTAT TAAAAAAC7A CAACCCTAAA 
60 

TGCAGGAGAA AAAACAAGCA TGCTGTCTAC TGAAGCT7AC AAATCAAATC CCTGCGATAT 
120 

GTC7TTTC7C GTGCCGAATT CGGCACGAGA AGATCTTGG7 TCGAGTCTCT CAGCTCTCTC
180 

CAAAGGAATT TTGTGGGTCA TTTGCAGGTG AAGACACCAT GGTGAAGGC7 TA7CCCACCG 
240 

TAAGCGAGGA GTACAAGGC7 GCCATTGACA AATGCAAGAG GAAGC7CCGA GC7C7CATTG 
300

CAGAGAAGAA CTGTGCGCC3 ATCATGGT7C GAATCGCATG GCACAGCG^T GGGACTTACG
360

ATGTCAAGAC CAAGACCGGA GGGCCCTTCG GGACGATGAG ATATGGGGCC GAGCTTGCC~
420

ACGGTGCTAA CAGTGGTCTG GAGATCGCAG TTAGGCTCCT GGAGCCAATC AAGGAACAG~

TCCCCATAAT CACCTATGCT GACCTTTATC AG7TGCCTCG TGTGGTGGCT GTTGAAGTGA 
540 

CCGGGGGACC TGACATTCCG 7TCCATCCTG GAAGAGAAGA CAAGCCTGAG CCTCCAGAAG
600 

AAGGCCGCCT TCCTGATGCT ACAAAAGGAC CTGATCA7CT GAGGGATGTT TTTGGTCACA 
660 

TGGGGTTGAA 7GATAAGGAA ATTGTGGCCT TGTCTGG7GC CCACACCT7G GGGAGATGCC
720

ACAAGGAGAG ATCTGG7TTT GAAGGACCAT GGACCTC7AA CCCCCTTATC TTTGACAACT
780

CTTACTTCAC AGAGCTTGTG ACTGGAGAGA AGGAAGGCCT GCTTCAGTTG CCATCTGATA
840 

AGGCACTGCT 73CTGATCCT AGTTTTGCAG TTTATGTTCA GAAGTATGCA CAGGACGAAG 
900 

ACGCTTTCT7 TGCTGACTA7 GCGGAAGCTC ACCTGAAGCT TT7TGAACTT GGGTTTGCTG 
960 

ATGCGTAGA7 TCATACCTTC TGCAGAGACA ATTCCT7GCT AGATAGCTTC GTTTTGTATT 
1020 

TCATCTAATC TTTTCGATTA 7ATAGTCACA TAGAAGTTGG TGTTATGCGC CATAGTGATA 
1080 

CTTGAACCTA CATGT77TTG AAAAGTA7CG ATGTTC7TTA AAATGAACAT TGAATACAAC 
1140

ATTTTGGAAT CTGGTTGTGT TCTATCAAGC GCATATTTTA ATCGAATGCT 7CGTTCCTGT
1200

TAAAAAAAAA AATAAAATAA AAAAAAAAA



(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1410 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

GAAGATGGGG CTGTGGGTGG TGCTGGCTTT GGCGCTCAGT GCGCACTATT GCAGTCTCAG 
60 

GCTTACAATG TGGTAAGTTC AAGCAATGCT ACTGGGAGTT ACAGTGAGAA TGGATTGGTG 
120 

ATGAATTACT ATGGGGACTC TTGCCCTCAG GCTGAAGAGA TCATTGCTGA ACAAGTACGC 
180 

CTGTTGTACA AAAGACACAA GAACACTGCA TTCTCATGGC TTAGAAATAT TTTCCATGAr 
240 

TGTGCTGTGG AGTCATGTGA TGCATCGCTT CTGTTGGACT CAACAAGGAA CAGCATATCA 
300 

GAAAAGGACA CTGACAGGAG CTTCGGCCTC CGCAACTTTA GGTATTTGGA TACCATCAAG 
360 

GAAGCCGTGG AGAGGGAGTG CCCCGGGG7C GTTTCCTGTG CAGATATACT CGTTCT^TC T 
420 

GCCAGAGATG GCGTTGTATC GTTGGGAGGA CCATACATTC CCCTGAAGAC GGGAAGAAGP 
480 

GATGGACGGA AGAGCAGAGC AGATGTGGTG GAGAATTACC TGCCCGATCA CAATGAGAGr 
540 

ATCTCCACTG TTCTGTCTCG CTTCAAAGCC ATGGGAATCG ACACCCGTGG GGTTGTTG^A 
600 

CTGCTGGGGG CTCACAGCGT GGGGAGGACT CACTGCGTGA AG77GGTGCA CAGGCTGTAC 
660 
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- Gvjr.nbTr.u .-. . CCGr-.C ACT A_> AC C 2 T G • _> G AC GT 1- ■ ~ A G ■ 1* /■. " . T G AAG C A 7 AAGTG „ G 3 C 
720 

GACGCGATCC CCAACCCGAA GGCAG7GCAG TATG7GCGGA ACGAC23GGG .-ACGCCTATG 
780 

AAGCTGGACA ACAACTACTA CC7GAACC7G A.TGAACAA.GA AGGGGC7GC7 AATAG7 3GA.C 
340 

CA.GCAAC7G7 ATGCAGATTC GAGGACCAGG CCG7A7G7GA AGAAGA7GGC AAAAAGC "A G 
900 

GAATACTTC7 TCAAATAC77 C7CCCGGGCG CTCACCA7CC TC7C7CAGAA SAATCCT'^ 
960 

ACCGGCGCTC GAG GAG AAA T CCG7C3GCAG TGCTCGCTGA AAAA. C AAA T T CCACAC 1 --- 
1020 

AGCAAGCGTT GAGCGATAGC TCAA7GCCCC AGTGGTGGGA GT GATAGCGT GA7GCCACAG 
1080 

7GGTGGGCAT TTCATATATA AAT73CAG77 TGCG777T7A T7AGA7AA7C A7AA7GG7GT 
1140

GG7GTGAC7A TGCCC7GCGA ATCACATGGA TGAACCA2AA CCGAACCGTG AAACAG~-GG 
1200 

G77A7TGGG7 7ATG7AAGCA GAACCTT77A TTATAAGCAA PAAAGACAAT TTTGTCT'TT 
1260 

AT7G7AG7A7 AA77T7GTCA 7CAG7TAAAG T7GCTGA7GT G A7 AA7 AAC 7 2GAAA.CG27A 
1320 

AAA T A 7 G AC A AC7ACG7A77 7~G777GG7C ATCTGA7AA7 AACCGGAAAO "A.^AAA 
1380

GACAACTACA 7A7ATTG777 AAAAAAAAAA
1410

(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 687 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

GTAGTTTCGT TTTACAACAA TCTCAGG7TT 7GAATCTCAG AATAGTTGC3 AAAGGAAGC j 
60 

A7GACGAAG7 ACGTGATCG7 7AGC77CATT GTGTGT77GT T7 C7A.7TT3T 77C7GCG7GC 
120

A7AA77TC7G TGAATGGATT AG 77 GT CCA 7 GAAGA7CA7C 7 3TCAAAGCC TGTGCATGGG 
180 

CTTTCGTGGA CATT7TATAA GGACAGTTGC CCCGAC77GG AGGCCA7AGT GAAA7CGGTA 
240 

CT7GAGCCGG CGTTGGACGA AGATATCACT CAGGCCGCAC GC77CG7GAG AG77CATT7G 
300 

CATGACTGTT TTGTGCAGGG TTGCGATGGG TCCGTG7TGC TGACAGGAAC 7AAAAGAAAC 
360 

CCCAGTGAGC AACAGGCTCA GCCAAACTTA ACACTAAGAG CCCGGGCCTT GCAGCTGATC 
420 

GACGAAAT7A AAACCGCTGT AGAAGCTACC TGCAGTGGGG 77GTAACTTG TGCAGACATT 
480 

CTGGCTTTGG CTGCTCGTGA CTGGGTCCGC TCAGGAGGC2 2AAAA.TTTCC AG7ACCACTT 
540 

GGCGGCAGAG AT AG OCT AAA G777GCCA3T CAATCCOTAG 7TC7CGCCAA 7 AT AC CPA CT 
600 

CCAACTTTAA ATTTGACACA GCTGATGAAC ATTT77GGC7 CCAAAGGATT CAGTT7GGCC 
660

GAAATGGTTG CTCTTCAGGT GGCACAC 
687 

(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 688 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



3TAGT7TCG7 



G A T G AG G AA. 7 .-\G G T G A 7 G G 
120 

L. A T A-r". T *. TG . -. 7 l ••rvM - u\jn i 

180 

G G T T 7 G G T .-^ C A 7 7 7 7 A 7 A 

-i o 

AC T 7 G AG G G ' 
3 00 

CATGAC7G7* 
3 60 

GCCCGAG7G. ; 

;:o 

A 3 0 



G P " - " ' C M * 
7 ACAGGT7 
7AGC7CGA7 



•J ---".'or'. . L;r-. i 



.-.Lj'jH^.-.Cj . t _) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

n»T'T'^ m — > m - rr> m i-n /— 

'G7CAAAGC 
77 G G AG G C G AT AG 



0-L 



A7T7G 



54 0 

— • . 

. '_) j w o .-a . 

60 0 

660 

C G AAA. T G G 7 ' 
688 



jnTAL", G7 .-. i-v 
AA77TGAC 
j - i LTTCrt'jo 7 G o C AC A 1 '.^ 



;A777T7^ 



' G ~AjAAGC 






WO 98/11205



PCT/NZ97/00112 



Claims: 

1. An isolated DNA sequence comprising a nucleotide sequence selected from the
group consisting of:

(a) sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88;

(b) complements of the sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88;

(c) reverse complements of the sequences recited in SEQ ID NO: 3, 13, 16-70,
72-88;

(d) reverse sequences of the sequences recited in SEQ ID NO: 3, 13, 16-70,
72-88; and

(e) sequences having at least about a 99% probability of being the same as a
sequence of (a) - (d) as measured by computer algorithm FASTA.
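
Parts (a) through (d) of claim 1 differ only in strand and in reading direction. As a minimal illustrative sketch, and not as part of the claimed subject matter, these relationships might be expressed in Python as follows. The function names are hypothetical, the helpers assume unambiguous bases (A, C, G, T) only, and the example input is simply the first ten bases of SEQ ID NO: 70 as printed in the listing.

```python
# Illustrative helpers (hypothetical names) for the sequence relationships
# named in claim 1, parts (b)-(d). Assumption: only A, C, G and T occur.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _checked(seq: str) -> str:
    """Upper-case the sequence and reject anything other than A, C, G, T."""
    seq = seq.upper()
    bad = set(seq) - set("ACGT")
    if bad:
        raise ValueError(f"unsupported characters: {sorted(bad)}")
    return seq


def complement(seq: str) -> str:
    """Base-by-base complement, read in the same left-to-right order."""
    return _checked(seq).translate(_COMPLEMENT)


def reverse(seq: str) -> str:
    """The same strand written in the opposite order."""
    return _checked(seq)[::-1]


def reverse_complement(seq: str) -> str:
    """The opposite strand, written in its own 5'-3' direction."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    example = "CATTGATAGT"  # first ten bases of SEQ ID NO: 70 as printed
    print("sequence          ", example)
    print("complement        ", complement(example))
    print("reverse           ", reverse(example))
    print("reverse complement", reverse_complement(example))
```

Characters that are not A, C, G or T raise an error rather than being passed through, so OCR damage in a transcribed listing is reported instead of silently propagated.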

2. A DNA construct comprising a DNA sequence according to claim 1.

3. A transgenic cell comprising a DNA construct according to claim 2.

4. A DNA construct comprising, in the 5'-3' direction:

(a) a gene promoter sequence;

(b) an open reading frame coding for at least a functional portion of an
enzyme encoded by a nucleotide sequence selected from the group
consisting of sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88 and
sequences having at least about a 99% probability of being the same as a
sequence of SEQ ID NO: 3, 13, 16-70, 72-88 as measured by computer
algorithm FASTA; and

(c) a gene termination sequence.

5. The DNA construct of claim 4 wherein the open reading frame is in a sense 
orientation. 

6. The DNA construct of claim 4 wherein the open reading frame is in an antisense
orientation. 
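
The arrangement recited in claims 4 to 6 (promoter, open reading frame, terminator, read 5'-3', with the open reading frame in either orientation) can be sketched in code. The following Python fragment is only an assumption-laden illustration: `build_construct` and its parameters are hypothetical names, the short strings are placeholders rather than sequences from the listing, and the antisense orientation is modelled simply as the reverse complement of the open reading frame placed between the same flanking elements.

```python
# Hypothetical sketch of the element order in claims 4-6. The sequences used
# in the demo are placeholders, not real promoters, genes or terminators.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Opposite strand of seq, written 5'-3' (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_construct(promoter: str, orf: str, terminator: str,
                    orientation: str = "sense") -> str:
    """Concatenate promoter, open reading frame and terminator, 5' to 3'.

    orientation="sense" inserts the open reading frame as given;
    orientation="antisense" inserts its reverse complement, so that the
    transcript would be complementary to the normal messenger RNA.
    """
    if orientation == "sense":
        insert = orf.upper()
    elif orientation == "antisense":
        insert = reverse_complement(orf)
    else:
        raise ValueError("orientation must be 'sense' or 'antisense'")
    return promoter.upper() + insert + terminator.upper()


if __name__ == "__main__":
    # Placeholder elements chosen only to make the orientation visible.
    print(build_construct("AAAA", "ATGC", "TTTT", "sense"))
    print(build_construct("AAAA", "ATGC", "TTTT", "antisense"))
```

Only the inserted region changes between the two orientations; the promoter and terminator stay in place, which is why the claims treat orientation as a property of the open reading frame alone.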

7. The DNA construct of claim 4, wherein the gene promoter sequence and gene
termination sequences are functional in a plant host. 

8. The DNA construct of claim 4, wherein the gene promoter sequence provides
for transcription in xylem. 

9. The DNA construct of claim 4 further comprising a marker for identification of 
transformed cells. 

10. A DNA construct comprising, in the 5'-3' direction: 

(a) a gene promoter sequence;

(b) a non-coding region of a gene coding for an enzyme encoded by a
nucleotide sequence selected from the group consisting of sequences
recited in SEQ ID NO: 3, 13, 16-70, 72-88 and sequences having at least
about a 99% probability of being the same as a sequence of SEQ ID NO:
3, 13, 16-70, 72-88 as measured by computer algorithm FASTA; and

(c) a gene termination sequence. 

11. The DNA construct of claim 10 wherein the non-coding region is in a sense 
orientation. 

12. The DNA construct of claim 10 wherein the non-coding region is in an 
antisense orientation. 

13. The DNA construct of claim 10, wherein the gene promoter sequence and gene
termination sequences are functional in a plant host.

14. The DNA construct of claim 10, wherein the gene promoter sequence provides
for transcription in xylem. 

15. A transgenic plant cell comprising a DNA construct, the DNA construct
comprising, in the 5'-3' direction:

(a) a gene promoter sequence;

(b) an open reading frame coding for at least a functional portion of an
enzyme encoded by a nucleotide sequence selected from the group
consisting of sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88 and
sequences having at least about a 99% probability of being the same as a
sequence of SEQ ID NO: 3, 13, 16-70, 72-88 as measured by computer
algorithm FASTA; and

(c) a gene termination sequence. 

16. The transgenic plant cell of claim 15 wherein the open reading frame is in a 
sense orientation. 

17. The transgenic plant cell of claim 15 wherein the open reading frame is in an 
antisense orientation. 

18. The transgenic plant cell of claim 15 wherein the DNA construct further
comprises a marker for identification of transformed cells. 

19. A plant comprising a transgenic plant cell according to claim 15, or fruit or
seeds thereof. 

20. The plant of claim 19 wherein the plant is a woody plant. 

21. The plant of claim 20 wherein the plant is selected from the group consisting of
eucalyptus and pine species. 

22. A transgenic plant cell comprising a DNA construct, the DNA construct
comprising, in the 5'-3' direction:

(a) a gene promoter sequence;

(b) a non-coding region of a gene coding for an enzyme encoded by a
nucleotide sequence selected from the group consisting of sequences
recited in SEQ ID NO: 3, 13, 16-70, 72-88 and sequences having at least
about a 99% probability of being the same as a sequence of SEQ ID NO:
3, 13, 16-70, 72-88 as measured by computer algorithm FASTA; and

(c) a gene termination sequence. 

23. The transgenic plant cell of claim 22 wherein the non-coding region is in a sense 
orientation. 

24. The transgenic plant cell of claim 22 wherein the non-coding region is in an
antisense orientation. 

25. A plant comprising a transgenic plant cell according to claim 22, or fruit or
seeds thereof. 

26. The plant of claim 25 wherein the plant is a woody plant. 

27. The plant of claim 26, wherein the plant is selected from the group consisting of
eucalyptus and pine species. 

28. A method for modulating the lignin content of a plant comprising stably
incorporating into the genome of the plant a DNA construct comprising, in the
5'-3' direction:

(a) a gene promoter sequence;

(b) an open reading frame coding for at least a functional portion of an
enzyme encoded by a nucleotide sequence selected from the group
consisting of sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88 and
sequences having at least about a 99% probability of being the same as a
sequence of SEQ ID NO: 3, 13, 16-70, 72-88 as measured by computer
algorithm FASTA; and

(c) a gene termination sequence. 

29. The method of claim 28 wherein the plant is selected from the group consisting
of eucalyptus and pine species. 

30. The method of claim 28 wherein the open reading frame is in a sense 
orientation. 

31. The method of claim 28 wherein the open reading frame is in an antisense 
orientation. 

32. A method for modulating the lignin content of a plant comprising stably 
incorporating into the genome of the plant a DNA construct comprising, in the 
5'-3' direction:

(a) a gene promoter sequence;

(b) a non-coding region of a gene coding for an enzyme encoded by a
nucleotide sequence selected from the group consisting of sequences
recited in SEQ ID NO: 3, 13, 16-70, 72-88 and sequences having at least
about a 99% probability of being the same as a sequence of SEQ ID NO:
3, 13, 16-70, 72-88 as measured by computer algorithm FASTA; and

(c) a gene termination sequence. 

33. The method of claim 32 wherein the non-coding region is in a sense orientation. 

34. The method of claim 32 wherein the non-coding region is in an antisense 
orientation. 

35. The method of claim 32 wherein the plant is a woody plant. 

36. The method of claim 35, wherein the plant is selected from the group consisting
of eucalyptus and pine species. 

37. A method for producing a plant having altered lignin structure comprising:

(a) transforming a plant cell with a DNA construct comprising, in the 5'-3'
direction, a gene promoter sequence, an open reading frame coding for at
least a functional portion of an enzyme encoded by a nucleotide
sequence selected from the group consisting of sequences recited in SEQ
ID NO: 3, 13, 16-70, 72-88 and sequences having at least about a 99%
probability of being the same as a sequence of SEQ ID NO: 3, 13, 16-70,
72-88 as measured by computer algorithm FASTA, and a gene
termination sequence to provide a transgenic cell;

(b) cultivating the transgenic cell under conditions conducive to 
regeneration and mature plant growth. 



38. The method of claim 37 wherein the open reading frame is in a sense 
orientation. 

39. The method of claim 37 wherein the open reading frame is in an antisense 
orientation. 

40. The method of claim 37 wherein the plant is a woody plant. 

41. The method of claim 40 wherein the plant is selected from the group consisting 
of eucalyptus and pine species. 

42. A method for producing a plant having altered lignin structure comprising: 

(a) transforming a plant cell with a DNA construct comprising, in the 5'-3'
direction, a gene promoter sequence, a non-coding region of a gene
coding for an enzyme encoded by a nucleotide sequence selected from
the group consisting of sequences recited in SEQ ID NO: 3, 13, 16-70,
72-88 and sequences having at least about a 99% probability of being
the same as a sequence of SEQ ID NO: 3, 13, 16-70, 72-88 as measured
by computer algorithm FASTA, and a gene termination sequence to
provide a transgenic cell;

(b) cultivating the transgenic cell under conditions conducive to
regeneration and mature plant growth. 

43. The method of claim 42 wherein the non-coding region is in a sense orientation. 

44. The method of claim 42 wherein the non-coding region is in an antisense 
orientation. 

45. The method of claim 42 wherein the plant is a woody plant. 

46. The method of claim 45 wherein the plant is selected from the group consisting
of eucalyptus and pine species. 

47. A method of modifying the activity of an enzyme in a plant comprising stably 
incorporating into the genome of the plant a DNA construct including 

(a) a gene promoter sequence; 

(b) an open reading frame coding for at least a functional portion of an
enzyme encoded by a nucleotide sequence selected from the group
consisting of sequences recited in SEQ ID NO: 3, 13, 16-70, 72-88 and
sequences having at least about a 99% probability of being the same as a
sequence of SEQ ID NO: 3, 13, 16-70, 72-88 as measured by computer
algorithm FASTA; and

(c) a gene termination sequence. 

48. The method of claim 47 wherein the open reading frame is in a sense 
orientation. 

49. The method of claim 47 wherein the open reading frame is in an antisense 
orientation. 

50. A method of modifying the activity of an enzyme in a plant comprising stably
incorporating into the genome of the plant a DNA construct including

(a) a gene promoter sequence;

(b) a non-coding region of a gene coding for an enzyme encoded by a
nucleotide sequence selected from the group consisting of sequences
recited in SEQ ID NO: 3, 13, 16-70, 72-88 and sequences having at least
about a 99% probability of being the same as a sequence of SEQ ID NO:
3, 13, 16-70, 72-88 as measured by computer algorithm FASTA; and

(c) a gene termination sequence.

51. The method of claim 50 wherein the non-coding region is in a sense orientation.

52. The method of claim 50 wherein the non-coding region is in an antisense
orientation. 

53. The method of claim 50 wherein the plant is a woody plant. 

54. The method of claim 53 wherein the plant is selected from the group consisting 
of eucalyptus and pine species. 

This International Searching Authority found multiple (groups of)
inventions in this international application, as follows: 

1. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 3,17,48,49 encoding 
cinnamate 4-hydroxylase (C4H), plant expression constructs
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



2. Claims: 1-54 partially 

Isolated DNA sequences of ID nos 18,50-52 encoding coumarate 
3-hydroxylase (C3H), plant expression constructs
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



3. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 35,36,81 encoding 
phenolase (PNL), plant expression constructs incorporating
said sequences, methods to modulate lignin content, 
structure, and enzyme activity in plants, transgenic plants 
and plant cells containing said constructs. 



4. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 22-25,53-55 encoding 
O-methyl transferase (OMT), plant expression constructs
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



5. Claims: 1-54 all partially 

Isolated DNA sequence of ID no 30, encoding cinnamyl alcohol 
dehydrogenase (CAD), plant expression constructs 
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



6. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 26-29,58-70 encoding 

Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)



This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons: 
1. Claims Nos.:

because they relate to subject matter not required to be searched by this Authority, namely: 



2. Claims Nos.:

because they relate to parts of the International Application that do not comply with the prescribed requirements to such 
an extent that no meaningful International Search can be carried out, specifically: 



3. Claims Nos.: 

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a). 



Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)



This International Searching Authority found multiple inventions in this international application, as follows: 

see additional sheet 

1. ) I As all required additional search fees were timely paid by the applicant, this International Search Report covers all
searchable claims.

2. | | As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.



3. | | As only some of the required additional search fees were timely paid by the applicant, this International Search Report
covers only those claims for which fees were paid, specifically claims Nos.:



4. | | No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is 
restricted to the invention first mentioned in the claims; it is covered by claims Nos.: 



Remark on Protest | X | The additional search fees were accompanied by the applicant's protest. 

| | No protest accompanied the payment of additional search fees.

12. Claims: 1-54 all partially

Isolated DNA sequences of ID nos 13,42-44,85-88 encoding 
peroxidase (POX), plant expression constructs incorporating
said sequences, methods to modulate lignin content, 
structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



13. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 19-21 encoding 
ferulate-5-hydroxylase (F5H), plant expression constructs 
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 

cinnamoyl-CoA reductase (CCR), plant expression constructs
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



7. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 16,45-47 encoding 
phenylalanine ammonia lyase (PAL), plant expression 
constructs incorporating said sequences, methods to modulate 
lignin content, structure, and enzyme activity in plants 
using said constructs, and transgenic plants and plant cells 
containing them. 



8. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 56,57 encoding 
4-coumarate:CoA ligase (4CL), plant expression constructs
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



9. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 31-33,72 encoding coniferol 
glucosyl transferase (CGT), plant expression constructs
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



10. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 34,73-80 encoding coniferin 
beta-glucosidase (CBG), plant expression constructs
incorporating said sequences, methods to modulate lignin 
content, structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 



11. Claims: 1-54 all partially 

Isolated DNA sequences of ID nos 37-41,82-84 encoding 
laccase (LAC), plant expression constructs incorporating
said sequences, methods to modulate lignin content, 
structure, and enzyme activity in plants using said 
constructs, and transgenic plants and plant cells containing 
them. 